Abstract



Unipotent flows are well-behaved dynamical systems. In particu- 
lar, Marina Ratner has shown that the closure of every orbit for such 
a flow is of a nice algebraic (or geometric) form. This is known as 
the Ratner Orbit Closure Theorem; the Ratner Measure-Classification 
Theorem and the Ratner Equidistribution Theorem are closely related 
results. After presenting these important theorems and some of their 
consequences, the lectures explain the main ideas of the proof. Some 
algebraic technicalities will be pushed to the background. 

Chapter 1 is the main part of the book. It is intended for a fairly 
general audience, and provides an elementary introduction to the sub- 
ject, by presenting examples that illustrate the theorems, some of their 
applications, and the main ideas involved in the proof. 

Chapter 2 gives an elementary introduction to the theory of entropy,
and proves an estimate used in the proof of Ratner's Theorems.
It is of independent interest.

Chapters 3 and 4 are utilitarian. They present some basic facts of
ergodic theory and the theory of algebraic groups that are needed in
the proof. The reader (or lecturer) may wish to skip over them, and
refer back as necessary.

Chapter 5 presents a fairly complete (but not entirely rigorous) 
proof of Ratner's Measure-Classification Theorem. Unlike the other 
chapters, it is rather technical. The entropy argument that finishes our 
presentation of the proof is due to G. A. Margulis and G. Tomanov. 
Earlier parts of our argument combine ideas from Ratner's original 
proof with the approach of G. A. Margulis and G. Tomanov. 

The first four chapters can be read independently, and are intended
to be largely accessible to second-year graduate students. All four are
needed for Chapter 5. A reader who is familiar with ergodic theory and
algebraic groups, but not unipotent flows, may skip Chaps. 2, 3, and 4
entirely, and read only §1.5–§1.8 of Chap. 1 before beginning Chap. 5.
Possible lecture schedules



It is quite reasonable to stop anywhere after §1.5. In particular, a
single lecture (1-2 hours) can cover the main points of §1.1–§1.5.

A good selection for a moderate series of lectures would be §1.1–§1.8
and §5.1, adding §2.1–§2.5 if the audience is not familiar with
entropy. For a more logical presentation, one should briefly discuss §3.1
(the Pointwise Ergodic Theorem) before starting §1.5–§1.8.

Here are suggested guidelines for a longer course: 
§1.1–§1.3: Introduction to Ratner's Theorems (0.5-1.5 hours)
§1.4: Applications of Ratner's Theorems (optional, 0-1 hour)
§1.5–§1.6: Shearing and polynomial divergence (1-2 hours)
§1.7–§1.8: Other basic ingredients of the proof (1-2 hours)
§1.9: From measures to orbit closures (optional, 0-1 hour)

§2.1–§2.3: What is entropy? (1-1.5 hours)
§2.4–§2.5: How to calculate entropy (1-2 hours)
§2.6: Proof of the entropy estimate (optional, 1-2 hours)

§3.1: Pointwise Ergodic Theorem (0.5-1.5 hours)
§3.2: Mautner Phenomenon (optional, 0.5-1.5 hours)
§3.3: Ergodic decomposition (optional, 0.5-1.5 hours)
§3.4: Averaging sets (0.5-1.5 hours)

§4.1–§4.9: Algebraic groups (optional, 0.5-3 hours)

§5.1: Outline of the proof (0.5-1.5 hours)
§5.2–§5.7: A fairly complete proof (3-5 hours)
§5.8–§5.9: Making the proof more rigorous (optional, 1-3 hours)
CHAPTER 1



Introduction to Ratner's Theorems 

1.1. What is Ratner's Orbit Closure Theorem? 

We begin by looking at an elementary example. 

(1.1.1) Example. For convenience, let us use $[x]$ to denote the image
of a point $x \in \mathbb{R}^n$ in the $n$-torus
$\mathbb{T}^n = \mathbb{R}^n/\mathbb{Z}^n$; that is,

$$[x] = x + \mathbb{Z}^n.$$

Any vector $v \in \mathbb{R}^n$ determines a $C^\infty$ flow $\varphi_t$ on $\mathbb{T}^n$, by

$$\varphi_t([x]) = [x + tv] \quad \text{for } x \in \mathbb{R}^n \text{ and } t \in \mathbb{R} \qquad (1.1.2)$$

(see Exer. 2). It is well known that the closure of the orbit of each
point of $\mathbb{T}^n$ is a subtorus of $\mathbb{T}^n$ (see Exer. 5, or see Exers. 3 and 4 for
examples). More precisely, for each $x \in \mathbb{R}^n$, there is a vector subspace $S$
of $\mathbb{R}^n$, such that

S1) $v \in S$ (so the entire $\varphi_t$-orbit of $[x]$ is contained in $[x + S]$),

S2) the image $[x + S]$ of $x + S$ in $\mathbb{T}^n$ is compact (hence, the image is
diffeomorphic to $\mathbb{T}^k$, for some $k \in \{0, 1, 2, \ldots, n\}$), and

S3) the $\varphi_t$-orbit of $[x]$ is dense in $[x + S]$ (so $[x + S]$ is the closure of
the orbit of $[x]$).

In short, the closure of every orbit is a nice, geometric subset of $\mathbb{T}^n$.
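
The dichotomy in this example can be observed numerically on the 2-torus. The following is only a rough sketch, not part of the original notes: the helper names `flow` and `cells_visited`, the 10 x 10 grid, and the sampling parameters are all illustrative assumptions. It samples the orbit of $[0]$ and records which grid cells it meets; a direction with rational slope stays on a closed curve, while a direction with irrational slope eventually meets every cell. A finite sample can only suggest density; it does not prove it.

```python
import math

def flow(x, v, t):
    """phi_t([x]) = [x + t*v]; a point of the torus is stored as coordinates mod 1."""
    return tuple((xi + t * vi) % 1.0 for xi, vi in zip(x, v))

def cells_visited(v, grid=10, t_max=200.0, dt=0.01):
    """Cells of a grid x grid partition of T^2 met by the sampled orbit of [0]."""
    seen = set()
    for k in range(int(t_max / dt) + 1):
        p = flow((0.0, 0.0), v, k * dt)
        # min(...) guards against a coordinate that rounds up to the last edge
        seen.add(tuple(min(int(c * grid), grid - 1) for c in p))
    return seen

rational = cells_visited((1.0, 1.0))               # orbit closure is a circle
irrational = cells_visited((1.0, math.sqrt(2.0)))  # orbit closure is all of T^2
print(len(rational), "cells vs", len(irrational), "cells")
```

In the language of (S1)-(S3), the first direction corresponds to a one-dimensional subspace $S$ and the second to $S = \mathbb{R}^2$.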

Ratner's Orbit Closure Theorem is a far-reaching generalization of 
Eg. 1.1.1. Let us examine the building blocks of that example. 

• Note that $\mathbb{R}^n$ is a Lie group. That is, it is a group (under vector
addition) and a manifold, and the group operations are $C^\infty$
functions.

• The subgroup $\mathbb{Z}^n$ is discrete. (That is, it has no accumulation
points.) Therefore, the quotient space $\mathbb{R}^n/\mathbb{Z}^n = \mathbb{T}^n$ is a manifold.

Copyright © 2003-2005 Dave Witte Morris. All rights reserved. 
Permission to make copies of these lecture notes for educational or scientific use, 
including multiple copies for classroom or seminar teaching, is granted (without 
• The quotient space $\mathbb{R}^n/\mathbb{Z}^n$ is compact.

• The map $t \mapsto tv$ (which appears in the formula (1.1.2)) is a
one-parameter subgroup of $\mathbb{R}^n$; that is, it is a $C^\infty$ group
homomorphism from $\mathbb{R}$ to $\mathbb{R}^n$.

Ratner's Theorem allows:

• the Euclidean space $\mathbb{R}^n$ to be replaced by any Lie group $G$;

• the subgroup $\mathbb{Z}^n$ to be replaced by any discrete subgroup $\Gamma$
of $G$, such that the quotient space $\Gamma\backslash G$ is compact; and

• the map $t \mapsto tv$ to be replaced by any unipotent one-parameter
subgroup $u^t$ of $G$. (The definition of "unipotent" will be explained
later.)

Given $G$, $\Gamma$, and $u^t$, we may define a $C^\infty$ flow $\varphi_t$ on $\Gamma\backslash G$ by

$$\varphi_t(\Gamma x) = \Gamma x u^t \quad \text{for } x \in G \text{ and } t \in \mathbb{R} \qquad (1.1.3)$$

(cf. 1.1.2 and see Exer. 7). We may also refer to $\varphi_t$ as the $u^t$-flow
on $\Gamma\backslash G$. Ratner proved that the closure of every $\varphi_t$-orbit is a nice,
geometric subset of $\Gamma\backslash G$. More precisely (note the direct analogy with
the conclusions of Eg. 1.1.1), if we write $[x]$ for the image of $x$ in $\Gamma\backslash G$,
then, for each $x \in G$, there is a closed, connected subgroup $S$ of $G$,
such that

S1') $\{u^t\}_{t \in \mathbb{R}} \subset S$ (so the entire $\varphi_t$-orbit of $[x]$ is contained in $[xS]$),

S2') the image $[xS]$ of $xS$ in $\Gamma\backslash G$ is compact (hence, diffeomorphic to
the homogeneous space $\Lambda\backslash S$, for some discrete subgroup $\Lambda$ of $S$),
and

S3') the $\varphi_t$-orbit of $[x]$ is dense in $[xS]$ (so $[xS]$ is the closure of the
orbit).

(1.1.4) Remark.

1) Recall that $\Gamma\backslash G = \{\, \Gamma x \mid x \in G \,\}$ is the set of right cosets of $\Gamma$
in $G$. We will consistently use right cosets $\Gamma x$, but all of the results
can easily be translated into the language of left cosets $x\Gamma$. For
example, a $C^\infty$ flow $\varphi'_t$ can be defined on $G/\Gamma$ by $\varphi'_t(x\Gamma) = u^t x\Gamma$.

2) It makes no difference whether we write $\mathbb{R}^n/\mathbb{Z}^n$ or $\mathbb{Z}^n\backslash\mathbb{R}^n$ for $\mathbb{T}^n$,
because right cosets and left cosets are the same in an abelian
group.

(1.1.5) Notation. For a very interesting special case, which will be the 
main topic of most of this chapter, 

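• let

$$G = \operatorname{SL}(2,\mathbb{R})$$

be the group of $2 \times 2$ real matrices of determinant one; that is

$$\operatorname{SL}(2,\mathbb{R}) = \left\{ \begin{bmatrix} a & b \\ c & d \end{bmatrix} \;\middle|\; a, b, c, d \in \mathbb{R},\ ad - bc = 1 \right\},$$

and

• define $u, a \colon \mathbb{R} \to \operatorname{SL}(2,\mathbb{R})$ by

$$u^t = \begin{bmatrix} 1 & 0 \\ t & 1 \end{bmatrix} \quad \text{and} \quad a^t = \begin{bmatrix} e^t & 0 \\ 0 & e^{-t} \end{bmatrix}.$$

Easy calculations show that

$$u^{s+t} = u^s u^t \quad \text{and} \quad a^{s+t} = a^s a^t$$

(see Exer. 8), so $u^t$ and $a^t$ are one-parameter subgroups of $G$. For any
subgroup $\Gamma$ of $G$, define flows $\eta_t$ and $\gamma_t$ on $\Gamma\backslash G$, by

$$\eta_t(\Gamma x) = \Gamma x u^t \quad \text{and} \quad \gamma_t(\Gamma x) = \Gamma x a^t.$$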
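
The one-parameter subgroup identities can be spot-checked numerically. The sketch below is illustrative only: it takes $u^t$ to be the lower-triangular matrix with rows $(1, 0)$ and $(t, 1)$, and assumes $a^t$ is the diagonal matrix with entries $e^t$ and $e^{-t}$; the helper names (`mat_mul`, `close`, and so on) are hypothetical. A check at one pair $(s, t)$ is of course not a proof.

```python
import math

def mat_mul(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def u(t):
    """Lower-triangular unipotent matrix u^t."""
    return [[1.0, 0.0], [t, 1.0]]

def a(t):
    """Diagonal matrix a^t = diag(e^t, e^-t) (assumed form, see lead-in)."""
    return [[math.exp(t), 0.0], [0.0, math.exp(-t)]]

def det(A):
    """Determinant of a 2x2 matrix."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def close(A, B, tol=1e-12):
    """Entrywise comparison with a small tolerance."""
    return all(math.isclose(A[i][j], B[i][j], rel_tol=tol, abs_tol=tol)
               for i in range(2) for j in range(2))

s, t = 0.7, -1.3
print(close(mat_mul(u(s), u(t)), u(s + t)))   # homomorphism property for u
print(close(mat_mul(a(s), a(t)), a(s + t)))   # homomorphism property for a
print(math.isclose(det(u(t)), 1.0), math.isclose(det(a(t)), 1.0))
```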

(1.1.6) Remark. Assume (as usual) that $\Gamma$ is discrete and that $\Gamma\backslash G$
is compact. If $G = \operatorname{SL}(2,\mathbb{R})$, then, in geometric terms,

1) $\Gamma\backslash G$ is (essentially) the unit tangent bundle of a compact surface
of constant negative curvature (see Exer. 10),

2) $\gamma_t$ is called the geodesic flow on $\Gamma\backslash G$ (see Exer. 11), and

3) $\eta_t$ is called the horocycle flow on $\Gamma\backslash G$ (see Exer. 11).

(1.1.7) Definition. A square matrix $T$ is unipotent if 1 is the only
(complex) eigenvalue of $T$; in other words, $(T - 1)^n = 0$, where $n$ is the
number of rows (or columns) of $T$.
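
The nilpotency criterion in this definition is easy to test by machine. The following sketch (the function name `is_unipotent` and the tolerance are illustrative assumptions, not part of the notes) forms $N = T - 1$ and checks whether $N^n$ vanishes up to rounding.

```python
import math

def mat_mul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def is_unipotent(T, tol=1e-9):
    """Return True if (T - I)^n is (numerically) the zero matrix."""
    n = len(T)
    N = [[T[i][j] - (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    P = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(n):
        P = mat_mul(P, N)          # after the loop, P = N^n
    return all(abs(entry) <= tol for row in P for entry in row)

print(is_unipotent([[1.0, 0.0], [2.5, 1.0]]))                    # lower-triangular, 1s on the diagonal
print(is_unipotent([[math.e, 0.0], [0.0, 1.0 / math.e]]))        # diagonal with eigenvalues e, 1/e
```

The first matrix has 1 as its only eigenvalue, so the check succeeds; the second has eigenvalues $e$ and $1/e$, so it fails.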

(1.1.8) Example. Because $u^t$ is a unipotent matrix for every $t$, we
say that $u^t$ is a unipotent one-parameter subgroup of $G$. Thus,
Ratner's Theorem applies to the horocycle flow $\eta_t$: the closure of every
$\eta_t$-orbit is a nice, geometric subset of $\Gamma\backslash G$.

More precisely, algebraic calculations, using properties (S1', S2',
S3') show that $S = G$ (see Exer. 13). Thus, the closure of every orbit is
$[G] = \Gamma\backslash G$. In other words, every $\eta_t$-orbit is dense in the entire space
$\Gamma\backslash G$.

(1.1.9) Counterexample. In contrast, $a^t$ is not a unipotent matrix
(unless $t = 0$), so $\{a^t\}$ is not a unipotent one-parameter subgroup.
Therefore, Ratner's Theorem does not apply to the geodesic flow $\gamma_t$.

Indeed, although we omit the proof, it can be shown that the closures
of some orbits of $\gamma_t$ are very far from being nice, geometric subsets
of $\Gamma\backslash G$. For example, the closures of some orbits are fractals (nowhere
close to being a submanifold of $\Gamma\backslash G$). Specifically, for some orbits, if $C$
is the closure of the orbit, then some neighborhood (in $C$) of a point
in $C$ is homeomorphic to $C' \times \mathbb{R}$, where $C'$ is a Cantor set.
When we discuss some ideas of Ratner's proof (in §1.5), we will see,
more clearly, why the flow generated by this diagonal one-parameter 
subgroup behaves so differently from a unipotent flow. 

(1.1.10) Remark. It can be shown fairly easily that almost every
orbit of the horocycle flow $\eta_t$ is dense in $[G]$, and the same is true for
the geodesic flow $\gamma_t$ (cf. 3.2.7 and 3.2.4). Thus, for both of these flows,
it is easy to see that the closure of almost every orbit is $[G]$, which is
certainly a nice manifold. (This means that the fractal orbits of (1.1.9)
are exceptional; they form a set of measure zero.) The point of Ratner's
Theorem is that it replaces "almost every" by "every."

Our assumption that T\G is compact can be relaxed. 

(1.1.11) Definition. Let $\Gamma$ be a subgroup of a Lie group $G$.

• A measure $\mu$ on $G$ is left invariant if $\mu(gA) = \mu(A)$ for all
$g \in G$ and all measurable $A \subset G$. Similarly, $\mu$ is right invariant
if $\mu(Ag) = \mu(A)$ for all $g$ and $A$.

• Recall that any Lie group $G$ has a (left) Haar measure; that
is, there exists a left-invariant (regular) Borel measure $\mu$ on $G$.
Furthermore, $\mu$ is unique up to a scalar multiple. (There is also a
measure that is right invariant, but the right-invariant measure
may not be the same as the left-invariant measure.)

• A fundamental domain for a subgroup $\Gamma$ of a group $G$ is a
measurable subset $\mathcal{F}$ of $G$, such that

◦ $\Gamma\mathcal{F} = G$, and

◦ $\gamma\mathcal{F} \cap \mathcal{F}$ has measure 0, for all $\gamma \in \Gamma \smallsetminus \{e\}$.

• A subgroup $\Gamma$ of a Lie group $G$ is a lattice if

◦ $\Gamma$ is discrete, and

◦ some (hence, every) fundamental domain for $\Gamma$ has finite
measure (see Exer. 14).

(1.1.12) Definition. If $\Gamma$ is a lattice in $G$, then there is a unique
$G$-invariant probability measure $\mu_G$ on $\Gamma\backslash G$ (see Exers. 15, 16, and 17).
It turns out that $\mu_G$ can be represented by a smooth volume form on
the manifold $\Gamma\backslash G$. Thus, we may say that $\Gamma\backslash G$ has finite volume. We
often refer to $\mu_G$ as the Haar measure on $\Gamma\backslash G$.

(1.1.13) Example. Let

• $G = \operatorname{SL}(2,\mathbb{R})$ and

• $\Gamma = \operatorname{SL}(2,\mathbb{Z})$.
Figure 1.1A. When $\operatorname{SL}(2,\mathbb{R})$ is identified with (a double
cover of the unit tangent bundle of) the upper half
plane $\mathfrak{H}$, the shaded region is a fundamental domain
for $\operatorname{SL}(2,\mathbb{Z})$.

It is well known that $\Gamma$ is a lattice in $G$. For example, a fundamental
domain $\mathcal{F}$ is illustrated in Fig. 1.1A (see Exer. 18), and an easy
calculation shows that the (hyperbolic) measure of this set is finite (see
Exer. 19).
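
As a numerical sanity check of this finiteness claim, here is a small sketch. It assumes that the shaded region is $\{\, z \mid |z| \ge 1,\ |\operatorname{Re} z| \le 1/2 \,\}$ and that the hyperbolic area element is $y^{-2}\,dx\,dy$ (both are stated in the exercises rather than in the text above); the function name and the midpoint rule are illustrative choices. The inner integral over $y$ is done by hand, which leaves a one-dimensional integral whose exact value is $\pi/3$.

```python
import math

def hyperbolic_area(n=10_000):
    """Midpoint-rule estimate of the area of {|z| >= 1, |Re z| <= 1/2} for y^-2 dx dy."""
    left, right = -0.5, 0.5
    h = (right - left) / n
    total = 0.0
    for k in range(n):
        x = left + (k + 0.5) * h
        # inner integral: int_{sqrt(1 - x^2)}^{infinity} y^-2 dy = 1 / sqrt(1 - x^2)
        total += h / math.sqrt(1.0 - x * x)
    return total

area = hyperbolic_area()
print(area, math.pi / 3)   # the two numbers should agree to many digits
```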

Because compact sets have finite measure, one sees that if $\Gamma\backslash G$ is
compact (and $\Gamma$ is discrete!), then $\Gamma$ is a lattice in $G$ (see Exer. 21).
Thus, the following result generalizes our earlier description of Ratner's
Theorem. Note, however, that the subspace $[xS]$ may no longer be
compact; it, too, may be a noncompact space of finite volume.

(1.1.14) Theorem (Ratner Orbit Closure Theorem). If

• $G$ is any Lie group,

• $\Gamma$ is any lattice in $G$, and

• $\varphi_t$ is any unipotent flow on $\Gamma\backslash G$,

then the closure of every $\varphi_t$-orbit is homogeneous.

(1.1.15) Remark. Here is a more precise statement of the conclusion
of Ratner's Theorem (1.1.14).

• Use $[x]$ to denote the image in $\Gamma\backslash G$ of an element $x$ of $G$.

• Let $u^t$ be the unipotent one-parameter subgroup corresponding
to $\varphi_t$, so $\varphi_t([x]) = [xu^t]$.

Then, for each $x \in G$, there is a connected, closed subgroup $S$ of $G$,
such that

1) $\{u^t\}_{t \in \mathbb{R}} \subset S$,

2) the image $[xS]$ of $xS$ in $\Gamma\backslash G$ is closed, and has finite $S$-invariant
volume (in other words, $(x^{-1}\Gamma x) \cap S$ is a lattice in $S$ (see
Exer. 22)), and

3) the $\varphi_t$-orbit of $[x]$ is dense in $[xS]$.

(1.1.16) Example.

• Let $G = \operatorname{SL}(2,\mathbb{R})$ and $\Gamma = \operatorname{SL}(2,\mathbb{Z})$ as in Eg. 1.1.13.

• Let $u^t$ be the usual unipotent one-parameter subgroup of $G$ (as
in Notn. 1.1.5).

Algebraists have classified all of the connected subgroups of $G$ that
contain $u^t$. They are:

1) $\{u^t\}$,

2) the lower-triangular group $\left\{ \begin{bmatrix} * & 0 \\ * & * \end{bmatrix} \right\} \subset G$, and

3) $G$.

It turns out that the lower-triangular group does not have a lattice (cf.
Exer. 13), so we conclude that the subgroup $S$ must be either $\{u^t\}$
or $G$.

In other words, we have the following dichotomy:

each orbit of the $u^t$-flow on $\operatorname{SL}(2,\mathbb{Z})\backslash\operatorname{SL}(2,\mathbb{R})$
is either closed or dense.

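(1.1.17) Example. Let

• $G = \operatorname{SL}(3,\mathbb{R})$,

• $\Gamma = \operatorname{SL}(3,\mathbb{Z})$, and

• $u^t = \begin{bmatrix} 1 & 0 & 0 \\ t & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \in G$.

Some orbits of the $u^t$-flow are closed, and some are dense, but there are
also intermediate possibilities. For example, $\operatorname{SL}(2,\mathbb{R})$ can be embedded
in the top left corner of $\operatorname{SL}(3,\mathbb{R})$:

$$\operatorname{SL}(2,\mathbb{R}) \cong \begin{bmatrix} * & * & 0 \\ * & * & 0 \\ 0 & 0 & 1 \end{bmatrix} \subset \operatorname{SL}(3,\mathbb{R}).$$

This induces an embedding

$$\operatorname{SL}(2,\mathbb{Z})\backslash\operatorname{SL}(2,\mathbb{R}) \hookrightarrow \operatorname{SL}(3,\mathbb{Z})\backslash\operatorname{SL}(3,\mathbb{R}). \qquad (1.1.18)$$

The image of this embedding is a submanifold, and it is the closure of
certain orbits of the $u^t$-flow (see Exer. 25).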
(1.1.19) Remark. Ratner's Theorem (1.1.14) also applies, more generally,
to the orbits of any subgroup $H$ that is generated by unipotent
elements, not just a one-dimensional subgroup. (However, if the subgroup
is disconnected, then the subgroup $S$ of Rem. 1.1.15 may also be
disconnected. It is true, though, that every connected component of $S$
contains an element of $H$.)

Exercises for §1.1. 

#1. Show that, in general, the closure of a submanifold may be a
bad set, such as a fractal. (Ratner's Theorem shows that this
pathology cannot appear if the submanifold is an orbit of a
"unipotent" flow.) More precisely, for any closed subset $C$ of $\mathbb{T}^2$,
show there is an injective $C^\infty$ function $f \colon \mathbb{R} \to \mathbb{T}^3$, such that

$$\overline{f(\mathbb{R})} \cap (\mathbb{T}^2 \times \{0\}) = C \times \{0\},$$

where $\overline{f(\mathbb{R})}$ denotes the closure of the image of $f$.

[Hint: Choose a countable, dense subset $\{c_n\}_{n=-\infty}^{\infty}$ of $C$, and choose $f$
(carefully!) with $f(n) = c_n$.]

#2. Show that (1.1.2) defines a $C^\infty$ flow on $\mathbb{T}^n$; that is,

(a) $\varphi_0$ is the identity map,

(b) $\varphi_{s+t}$ is equal to the composition $\varphi_s \circ \varphi_t$, for all $s, t \in \mathbb{R}$; and

(c) the map $\varphi \colon \mathbb{T}^n \times \mathbb{R} \to \mathbb{T}^n$, defined by $\varphi(x,t) = \varphi_t(x)$, is $C^\infty$.

#3. Let $v = (\alpha, \beta) \in \mathbb{R}^2$. Show, for each $x \in \mathbb{R}^2$, that the closure of
$[x + \mathbb{R}v]$ is

$$\begin{cases} \{[x]\} & \text{if } \alpha = \beta = 0, \\ [x + \mathbb{R}v] & \text{if } \alpha/\beta \in \mathbb{Q} \text{ (or } \beta = 0\text{)}, \\ \mathbb{T}^2 & \text{if } \alpha/\beta \notin \mathbb{Q} \text{ (and } \beta \neq 0\text{)}. \end{cases}$$

#4. Let $v = (\alpha, 1, 0) \in \mathbb{R}^3$, with $\alpha$ irrational, and let $\varphi_t$ be the
corresponding flow on $\mathbb{T}^3$ (see 1.1.2). Show that the subtorus $\mathbb{T}^2 \times \{0\}$
of $\mathbb{T}^3$ is the closure of the $\varphi_t$-orbit of $(0, 0, 0)$.

#5. Given $x$ and $v$ in $\mathbb{R}^n$, show that there is a vector subspace $S$
of $\mathbb{R}^n$, that satisfies (S1), (S2), and (S3) of Eg. 1.1.1.

#6. Show that the subspace $S$ of Exer. 5 depends only on $v$, not
on $x$. (This is a special property of abelian groups; the analogous
statement is not true in the general setting of Ratner's Theorem.)

#7. Given

• a Lie group $G$,

• a closed subgroup $\Gamma$ of $G$, and

• a one-parameter subgroup $g^t$ of $G$,

show that $\varphi_t(\Gamma x) = \Gamma x g^t$ defines a flow on $\Gamma\backslash G$.
#8. For $u^t$ and $a^t$ as in Notn. 1.1.5, and all $s, t \in \mathbb{R}$, show that

(a) $u^{s+t} = u^s u^t$, and

(b) $a^{s+t} = a^s a^t$.

#9. Show that the subgroup $\{a^s\}$ of $\operatorname{SL}(2,\mathbb{R})$ normalizes the subgroup
$\{u^t\}$. That is, $a^{-s}\{u^t\}a^s = \{u^t\}$ for all $s$.
#10. Let $\mathfrak{H} = \{\, x + iy \in \mathbb{C} \mid y > 0 \,\}$ be the upper half plane (or
hyperbolic plane), with Riemannian metric $\langle \cdot \mid \cdot \rangle$ defined by

$$\langle v \mid w \rangle_{x+iy} = \frac{1}{y^2}(v \cdot w),$$

for tangent vectors $v, w \in T_{x+iy}\mathfrak{H}$, where $v \cdot w$ is the usual
Euclidean inner product on $\mathbb{R}^2 \cong \mathbb{C}$.

(a) Show that the formula

$$gz = \frac{az + c}{bz + d} \quad \text{for } z \in \mathfrak{H} \text{ and } g = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \in \operatorname{SL}(2,\mathbb{R})$$

defines an action of $\operatorname{SL}(2,\mathbb{R})$ by isometries on $\mathfrak{H}$.

(b) Show that this action is transitive on $\mathfrak{H}$.

(c) Show that the stabilizer $\operatorname{Stab}_{\operatorname{SL}(2,\mathbb{R})}(i)$ of the point $i$ is

$$\operatorname{SO}(2) = \left\{ \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix} \right\}.$$

(d) The unit tangent bundle $T^1\mathfrak{H}$ consists of the tangent vectors
of length 1. By differentiation, we obtain an action of
$\operatorname{SL}(2,\mathbb{R})$ on $T^1\mathfrak{H}$. Show that this action is transitive.

(e) For any unit tangent vector $v \in T^1\mathfrak{H}$, show

$$\operatorname{Stab}_{\operatorname{SL}(2,\mathbb{R})}(v) = \pm I.$$

Thus, we may identify $T^1\mathfrak{H}$ with $\operatorname{SL}(2,\mathbb{R})/\{\pm I\}$.

(f) It is well known that the geodesics in $\mathfrak{H}$ are semicircles (or
lines) that are orthogonal to the real axis. Any $v \in T^1\mathfrak{H}$ is
tangent to a unique geodesic. The geodesic flow $\hat\gamma_t$ on $T^1\mathfrak{H}$
moves the unit tangent vector $v$ a distance $t$ along the geodesic
it determines. Show, for some vector $v$ (tangent to the
imaginary axis), that, under the identification of Exer. 10e,
the geodesic flow $\hat\gamma_t$ corresponds to the flow $x \mapsto xa^{ct}$ on
$\operatorname{SL}(2,\mathbb{R})/\{\pm I\}$, for some $c \in \mathbb{R}$.

(g) The horocycles in $\mathfrak{H}$ are the circles that are tangent to the
real axis (and the lines that are parallel to the real axis).
Each $v \in T^1\mathfrak{H}$ is an inward unit normal vector to a unique
horocycle $H_v$. The horocycle flow $\hat\eta_t$ on $T^1\mathfrak{H}$ moves the
unit tangent vector $v$ a distance $t$ (counterclockwise, if $t$ is
positive) along the corresponding horocycle $H_v$. Show, for
the identification in Exer. 10f, that the horocycle flow corresponds
to the flow $x \mapsto xu^t$ on $\operatorname{SL}(2,\mathbb{R})/\{\pm I\}$.

Figure 1.1B. The geodesic flow on $\mathfrak{H}$.

Figure 1.1C. The horocycle flow on $\mathfrak{H}$.

#11. Let $X$ be any compact, connected surface of (constant) negative
curvature $-1$. We use the notation and terminology of Exer. 10.
It is known that there is a covering map $\rho \colon \mathfrak{H} \to X$ that is a local
isometry. Let

$$\Gamma = \{\, \gamma \in \operatorname{SL}(2,\mathbb{R}) \mid \rho(\gamma z) = \rho(z) \text{ for all } z \in \mathfrak{H} \,\}.$$

(a) Show that

(i) $\Gamma$ is discrete, and

(ii) $\Gamma\backslash G$ is compact.

(b) Show that the unit tangent bundle $T^1X$ can be identified
with $\Gamma\backslash G$, in such a way that

(i) the geodesic flow on $T^1X$ corresponds to the flow $\gamma_t$ on
$\Gamma\backslash\operatorname{SL}(2,\mathbb{R})$, and

(ii) the horocycle flow on $T^1X$ corresponds to the flow $\eta_t$
on $\Gamma\backslash\operatorname{SL}(2,\mathbb{R})$.

#12. Suppose $\Gamma$ and $H$ are subgroups of a group $G$. For $x \in G$, let

$$\operatorname{Stab}_H(\Gamma x) = \{\, h \in H \mid \Gamma xh = \Gamma x \,\}$$

be the stabilizer of $\Gamma x$ in $H$. Show $\operatorname{Stab}_H(\Gamma x) = x^{-1}\Gamma x \cap H$.
#13. Let

• $G = \operatorname{SL}(2,\mathbb{R})$,

• $S$ be a connected subgroup of $G$ containing $\{u^t\}$, and

• $\Gamma$ be a discrete subgroup of $G$, such that $\Gamma\backslash G$ is compact.

It is known (and you may assume) that

(a) if $\dim S = 2$, then $S$ is conjugate to the lower-triangular
group $B$,

(b) if there is a discrete subgroup $\Lambda$ of $S$, such that $\Lambda\backslash S$ is compact,
then $S$ is unimodular, that is, the determinant of the
linear transformation $\operatorname{Ad}_S g$ is 1, for each $g \in S$, and

(c) $I$ is the only unipotent matrix in $\Gamma$.

Show that if there is a discrete subgroup $\Lambda$ of $S$, such that

• $\Lambda\backslash S$ is compact, and

• $\Lambda$ is conjugate to a subgroup of $\Gamma$,

then $S = G$.

[Hint: If $\dim S \in \{1, 2\}$, obtain a contradiction.]

#14. Show that if $\Gamma$ is a discrete subgroup of $G$, then all fundamental
domains for $\Gamma$ have the same measure. In particular, if one
fundamental domain has finite measure, then all do.
[Hint: $\mu(\gamma A) = \mu(A)$, for all $\gamma \in \Gamma$, and every subset $A$ of

#15. Show that if $G$ is unimodular (that is, if the left Haar measure
is also invariant under right translations) and $\Gamma$ is a lattice in $G$,
then there is a $G$-invariant probability measure on $\Gamma\backslash G$.
[Hint: For $A \subset \Gamma\backslash G$, define $\mu_G(A) = \mu(\{\, g \in \mathcal{F} \mid \Gamma g \in A \,\})$.]

#16. Show that if $\Gamma$ is a lattice in $G$, then there is a $G$-invariant
probability measure on $\Gamma\backslash G$.

[Hint: Use the uniqueness of Haar measure to show, for $\mu_G$ as in
Exer. 15 and $g \in G$, that there exists $\Delta(g) \in \mathbb{R}^+$, such that
$\mu_G(Ag) = \Delta(g)\,\mu_G(A)$ for all $A \subset \Gamma\backslash G$. Then show $\Delta(g) = 1$.]

#17. Show that if $\Gamma$ is a lattice in $G$, then the $G$-invariant probability
measure $\mu_G$ on $\Gamma\backslash G$ is unique.

[Hint: Use $\mu_G$ to define a $G$-invariant measure on $G$, and use the
uniqueness of Haar measure.]

#18. Let

• $G = \operatorname{SL}(2,\mathbb{R})$,

• $\Gamma = \operatorname{SL}(2,\mathbb{Z})$,

• $\mathcal{F} = \{\, z \in \mathfrak{H} \mid |z| \ge 1 \text{ and } -1/2 \le \operatorname{Re} z \le 1/2 \,\}$, and

• $e_1 = (1,0)$ and $e_2 = (0,1)$,

and define

• $B \colon G \to \mathbb{R}^2 \times \mathbb{R}^2$ by $B(g) = (g^T e_1, g^T e_2)$, where $g^T$ denotes the
transpose of $g$,

• $C \colon \mathbb{R}^2 \to \mathbb{C}$ by $C(x,y) = x + iy$, and

• $\zeta \colon G \to \mathbb{C}$ by

$$\zeta(g) = \frac{C(g^T e_2)}{C(g^T e_1)}.$$

Show:

(a) $\zeta(G) = \mathfrak{H}$,

(b) $\zeta$ induces a homeomorphism $\bar\zeta \colon \mathfrak{H} \to \mathfrak{H}$, defined by $\bar\zeta(gi) = \zeta(g)$,

(c) $\zeta(\gamma g) = \gamma\,\zeta(g)$, for all $g \in G$ and $\gamma \in \Gamma$,

(d) for $g, h \in G$, there exists $\gamma \in \Gamma$, such that $\gamma g = h$ if and only
if $\langle g^T e_1, g^T e_2 \rangle_{\mathbb{Z}} = \langle h^T e_1, h^T e_2 \rangle_{\mathbb{Z}}$, where $\langle v_1, v_2 \rangle_{\mathbb{Z}}$ denotes the
abelian group consisting of all integral linear combinations
of $v_1$ and $v_2$,

(e) for $g \in G$, there exist $v_1, v_2 \in \langle g^T e_1, g^T e_2 \rangle_{\mathbb{Z}}$, such that

(i) $\langle v_1, v_2 \rangle_{\mathbb{Z}} = \langle g^T e_1, g^T e_2 \rangle_{\mathbb{Z}}$, and

(ii) $C(v_2)/C(v_1) \in \mathcal{F}$,

(f) $\Gamma\mathcal{F} = \mathfrak{H}$,

(g) if $\gamma \in \Gamma \smallsetminus \{\pm I\}$, then $\gamma\mathcal{F} \cap \mathcal{F}$ has measure 0, and

(h) $\{\, g \in G \mid gi \in \mathcal{F} \,\}$ is a fundamental domain for $\Gamma$ in $G$.

[Hint: Choose $v_1$ and $v_2$ to be nonzero vectors of minimal length in
$\langle g^T e_1, g^T e_2 \rangle_{\mathbb{Z}}$ and $\langle g^T e_1, g^T e_2 \rangle_{\mathbb{Z}} \smallsetminus \mathbb{Z}v_1$, respectively.]

#19. Show:

(a) the area element on the hyperbolic plane $\mathfrak{H}$ is $dA = y^{-2}\,dx\,dy$,
and

(b) the fundamental domain in Fig. 1.1A has finite hyperbolic
area.

[Hint: We have $\int_a^\infty \int_b^c y^{-2}\,dx\,dy < \infty$.]

#20. Show that if

• $\Gamma$ is a discrete subgroup of a Lie group $G$,

• $F$ is a measurable subset of $G$,

• $\Gamma F = G$, and

• $\mu(F) < \infty$,

then $\Gamma$ is a lattice in $G$.

#21. Show that if

• $\Gamma$ is a discrete subgroup of a Lie group $G$, and

• $\Gamma\backslash G$ is compact,

then $\Gamma$ is a lattice in $G$.

[Hint: Show there is a compact subset $C$ of $G$, such that $\Gamma C = G$, and
use Exer. 20.]

#22. Suppose

• $\Gamma$ is a discrete subgroup of a Lie group $G$, and

• $S$ is a closed subgroup of $G$.

Show that if the image $[xS]$ of $xS$ in $\Gamma\backslash G$ is closed, and has finite
$S$-invariant volume, then $(x^{-1}\Gamma x) \cap S$ is a lattice in $S$.

#23. Let

• $\Gamma$ be a lattice in a Lie group $G$,

• $\{x_n\}$ be a sequence of elements of $G$.

Show that $[x_n]$ has no subsequence that converges in $\Gamma\backslash G$ if and
only if there is a sequence $\{\gamma_n\}$ of nonidentity elements of $\Gamma$, such
that $x_n^{-1}\gamma_n x_n \to e$ as $n \to \infty$.

[Hint: ($\Leftarrow$) Contrapositive. If $\{x_{n_k}\} \subset \Gamma C$, where $C$ is compact, then
$x_{n_k}^{-1}\gamma_{n_k} x_{n_k}$ is bounded away from $e$. ($\Rightarrow$) Let $\mathcal{O}$ be a small open subset
of $G$. By passing to a subsequence, we may assume $[x_m\mathcal{O}] \cap [x_n\mathcal{O}] = \emptyset$,
for $m \neq n$. Since $\mu(\Gamma\backslash G) < \infty$, then $\mu([x_n\mathcal{O}]) \neq \mu(\mathcal{O})$, for some $n$. So
the natural map $x_n\mathcal{O} \to [x_n\mathcal{O}]$ is not injective. Hence, $x_n^{-1}\gamma x_n \in \mathcal{O}\mathcal{O}^{-1}$
for some $\gamma \in \Gamma$.]

#24. Prove the converse of Exer. 22. That is, if $(x^{-1}\Gamma x) \cap S$ is a lattice
in $S$, then the image $[xS]$ of $xS$ in $\Gamma\backslash G$ is closed (and has finite
$S$-invariant volume).

[Hint: Exer. 23 shows that the inclusion of $((x^{-1}\Gamma x) \cap S)\backslash S$ into $\Gamma\backslash G$
is a proper map.]

#25. Let $C$ be the image of the embedding (1.1.18). Assuming that
$C$ is closed, show that there is an orbit of the $u^t$-flow on
$\operatorname{SL}(3,\mathbb{Z})\backslash\operatorname{SL}(3,\mathbb{R})$ whose closure is $C$.

#26. [Requires some familiarity with hyperbolic geometry] Let $M$ be a
compact, hyperbolic $n$-manifold, so $M = \Gamma\backslash\mathfrak{H}^n$, for some discrete
group $\Gamma$ of isometries of hyperbolic $n$-space $\mathfrak{H}^n$. For any
$k < n$, there is a natural embedding $\mathfrak{H}^k \hookrightarrow \mathfrak{H}^n$. Composing this
with the covering map to $M$ yields a $C^\infty$ immersion $f \colon \mathfrak{H}^k \to M$.
Show that if $k \neq 1$, then there is a compact manifold $N$ and a
$C^\infty$ function $\psi \colon N \to M$, such that the closure $\overline{f(\mathfrak{H}^k)}$ is equal to
$\psi(N)$.

#27. Let $\Gamma = \operatorname{SL}(2,\mathbb{Z})$ and $G = \operatorname{SL}(2,\mathbb{R})$. Use Ratner's Orbit Closure
Theorem (and Rem. 1.1.19) to show, for each $g \in G$, that $\Gamma g\Gamma$ is
either dense in $G$ or discrete.

[Hint: You may assume, without proof, the fact that if $N$ is any connected
subgroup of $G$ that is normalized by $\Gamma$, then either $N$ is trivial,
or $N = G$. (This follows from the Borel Density Theorem (4.7.1).)]
1.2. Margulis, Oppenheim, and quadratic forms

Ratner's Theorems have important applications in number theory.
In particular, the following result was a major motivating factor. It is
often called the "Oppenheim Conjecture," but that terminology is no
longer appropriate, because it was proved more than 15 years ago, by
G. A. Margulis. See §1.4 for other (more recent) applications.

(1.2.1) Definition.

• A (real) quadratic form is a homogeneous polynomial of degree 2
(with real coefficients), in any number of variables. For
example,

$$Q(x, y, z, w) = x^2 - 2xy + \sqrt{3}\,yz - 4w^2$$

is a quadratic form (in 4 variables).

• A quadratic form $Q$ is indefinite if $Q$ takes both positive and
negative values. For example, $x^2 - 3xy + y^2$ is indefinite, but
$x^2 - 2xy + y^2$ is definite (see Exer. 2).

• A quadratic form $Q$ in $n$ variables is nondegenerate if there does
not exist a nonzero vector $x \in \mathbb{R}^n$, such that $Q(v + x) = Q(v - x)$,
for all $v \in \mathbb{R}^n$ (cf. Exer. 3).
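
The two examples of (in)definiteness above can be checked by direct evaluation. The sketch below is illustrative only (the names `q1`, `q2`, and `sign_pattern` are hypothetical): it evaluates each form on a small box of integer points and reports whether positive and negative values occur. Finding values of both signs certifies that a form is indefinite; failing to find a negative value on a finite sample does not by itself prove definiteness, although for $x^2 - 2xy + y^2 = (x - y)^2$ the absence of negative values is clear algebraically.

```python
def q1(x, y):
    """The form x^2 - 3xy + y^2."""
    return x * x - 3 * x * y + y * y

def q2(x, y):
    """The form x^2 - 2xy + y^2, which equals (x - y)^2."""
    return x * x - 2 * x * y + y * y

def sign_pattern(q, radius=5):
    """Return (takes a positive value, takes a negative value) on an integer box."""
    values = [q(x, y)
              for x in range(-radius, radius + 1)
              for y in range(-radius, radius + 1)]
    return (any(v > 0 for v in values), any(v < 0 for v in values))

print(sign_pattern(q1))   # both signs occur, e.g. q1(1, 0) = 1 and q1(1, 1) = -1
print(sign_pattern(q2))   # no negative values on the sample
```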

(1.2.2) Theorem (Margulis). Let $Q$ be a real, indefinite, nondegenerate
quadratic form in $n \ge 3$ variables.

If $Q$ is not a scalar multiple of a form with integer coefficients, then
$Q(\mathbb{Z}^n)$ is dense in $\mathbb{R}$.

(1.2.3) Example. If $Q(x,y,z) = x^2 - \sqrt{2}\,xy + \sqrt{3}\,z^2$, then $Q$ is not
a scalar multiple of a form with integer coefficients (see Exer. 4), so
Margulis' Theorem tells us that $Q(\mathbb{Z}^3)$ is dense in $\mathbb{R}$. That is, for each
$r \in \mathbb{R}$ and $\epsilon > 0$, there exist $a, b, c \in \mathbb{Z}$, such that $|Q(a, b, c) - r| < \epsilon$.
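
The conclusion of this example can be explored by brute force. The following sketch is only an illustration under stated assumptions: the target value $r = \pi$, the search bounds, and the helper name `best_approximation` are arbitrary choices, and floating-point arithmetic limits how small an error can be trusted. It searches a cube of integer points for the value of $Q$ closest to $r$; enlarging the cube can only improve the best error. A finite search illustrates the theorem but proves nothing about density.

```python
import math
from itertools import product

SQRT2, SQRT3 = math.sqrt(2.0), math.sqrt(3.0)

def Q(x, y, z):
    """The form x^2 - sqrt(2) xy + sqrt(3) z^2 of Eg. 1.2.3."""
    return x * x - SQRT2 * x * y + SQRT3 * z * z

def best_approximation(r, bound):
    """Integer point in [-bound, bound]^3 whose Q-value is closest to r."""
    box = range(-bound, bound + 1)
    return min(product(box, repeat=3), key=lambda p: abs(Q(*p) - r))

for bound in (5, 10, 20):
    p = best_approximation(math.pi, bound)
    print(bound, p, abs(Q(*p) - math.pi))
```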

(1.2.4) Remark.

1) The hypothesis that $Q$ is indefinite is necessary. If, say, $Q$ is
positive definite, then $Q(\mathbb{Z}^n) \subset \mathbb{R}^{\ge 0}$ is not dense in all of $\mathbb{R}$. In
fact, if $Q$ is definite, then $Q(\mathbb{Z}^n)$ is discrete (see Exer. 7).

2) There are counterexamples when $Q$ has only two variables (see
Exer. 8), so the assumption that there are at least 3 variables
cannot be omitted in general.

3) A quadratic form is degenerate if (and only if) a change of basis
turns it into a form with fewer variables. Thus, the counterexamples
of (2) show the assumption that $Q$ is nondegenerate cannot be
omitted in general (see Exer. 9).

4) The converse of Thm. 1.2.2 is true: if $Q(\mathbb{Z}^n)$ is dense in $\mathbb{R}$, then
$Q$ cannot be a scalar multiple of a form with integer coefficients
(see Exer. 10).

Margulis' Theorem (1.2.2) can be related to Ratner's Theorem by 
considering the orthogonal group of the quadratic form Q. 

(1.2.5) Definition. 

1) If $Q$ is a quadratic form in $n$ variables, then $\operatorname{SO}(Q)$ is the
orthogonal group (or isometry group) of $Q$. That is,

$$\operatorname{SO}(Q) = \{\, h \in \operatorname{SL}(n,\mathbb{R}) \mid Q(vh) = Q(v) \text{ for all } v \in \mathbb{R}^n \,\}.$$

(Actually, this is the special orthogonal group, because we are
including only the matrices of determinant one.)

2) As a special case, $\operatorname{SO}(m,n)$ is a shorthand for the orthogonal
group $\operatorname{SO}(Q_{m,n})$, where

$$Q_{m,n}(x_1, \ldots, x_{m+n}) = x_1^2 + \cdots + x_m^2 - x_{m+1}^2 - \cdots - x_{m+n}^2.$$

3) Furthermore, we use $\operatorname{SO}(m)$ to denote $\operatorname{SO}(m,0)$ (which is equal
to $\operatorname{SO}(0,m)$).
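
To make the definition concrete, one can verify numerically that a specific matrix preserves $Q_{2,1}$. The sketch below is illustrative only: the hyperbolic rotation `boost(t)` is a standard example chosen here (it is not taken from the notes), and vectors act as rows on the left, matching the convention $Q(vh)$ above.

```python
import math

def q21(v):
    """Q_{2,1}(x1, x2, x3) = x1^2 + x2^2 - x3^2."""
    x1, x2, x3 = v
    return x1 * x1 + x2 * x2 - x3 * x3

def boost(t):
    """Hyperbolic rotation in the (x1, x3)-plane (illustrative choice of h)."""
    c, s = math.cosh(t), math.sinh(t)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [s, 0.0, c]]

def row_times(v, h):
    """Row vector v times the 3x3 matrix h."""
    return tuple(sum(v[i] * h[i][j] for i in range(3)) for j in range(3))

def det3(h):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (h[0][0] * (h[1][1] * h[2][2] - h[1][2] * h[2][1])
            - h[0][1] * (h[1][0] * h[2][2] - h[1][2] * h[2][0])
            + h[0][2] * (h[1][0] * h[2][1] - h[1][1] * h[2][0]))

h = boost(0.8)
v = (1.0, 2.0, 3.0)
print(q21(v), q21(row_times(v, h)))   # the two values should agree up to rounding
print(det3(h))                        # should be 1 up to rounding
```

The agreement reflects the identity $\cosh^2 t - \sinh^2 t = 1$.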

(1.2.6) Definition. We use $H^\circ$ to denote the identity component
of a subgroup $H$ of $\operatorname{SL}(\ell,\mathbb{R})$; that is, $H^\circ$ is the connected component
of $H$ that contains the identity element $e$. It is a closed subgroup of $H$.

Because SO(Q) is a real algebraic group (see 4.1.2(8)), Whitney's 
Theorem (4.1.3) implies that it has only finitely many components. 
(In fact, it has only one or two components (see Exers. 11 and 13).) 
Therefore, the difference between SO(Q) and SO(Q)° is very minor, so 
it may be ignored on a first reading. 

Proof of Margulis' Theorem on values of quadratic forms. Let

• $G = \operatorname{SL}(3,\mathbb{R})$,

• $\Gamma = \operatorname{SL}(3,\mathbb{Z})$,

• $Q_0(x_1, x_2, x_3) = x_1^2 + x_2^2 - x_3^2$, and

• $H = \operatorname{SO}(Q_0)^\circ = \operatorname{SO}(2,1)^\circ$.

Let us assume $Q$ has exactly three variables (this causes no loss of
generality; see Exer. 15). Then, because $Q$ is indefinite, the signature
of $Q$ is either $(2,1)$ or $(1,2)$ (cf. Exer. 6); hence, after a change of
coordinates, $Q$ must be a scalar multiple of $Q_0$; thus, there exist
$g \in \operatorname{SL}(3,\mathbb{R})$ and $\lambda \in \mathbb{R}^\times$, such that

$$Q = \lambda\, Q_0 \circ g.$$

Note that $\operatorname{SO}(Q)^\circ = gHg^{-1}$ (see Exer. 14). Because $H \approx \operatorname{SL}(2,\mathbb{R})$
is generated by unipotent elements (see Exer. 16) and $\operatorname{SL}(3,\mathbb{Z})$ is a
lattice in $\operatorname{SL}(3,\mathbb{R})$ (see 4.8.5), we can apply Ratner's Orbit Closure
Theorem (see 1.1.19). The conclusion is that there is a connected subgroup
$S$ of $G$, such that

• $H \subset S$,

• the closure of $[gH]$ is equal to $[gS]$, and

• there is an $S$-invariant probability measure on $[gS]$.

Algebraic calculations show that the only closed, connected subgroups
of $G$ that contain $H$ are the two obvious subgroups: $G$ and $H$ (see
Exer. 17). Therefore, $S$ must be either $G$ or $H$. We consider each of
these possibilities separately.

Case 1. Assume $S = G$. This implies that

$$\Gamma gH \text{ is dense in } G. \qquad (1.2.7)$$

We have

$$\begin{aligned}
Q(\mathbb{Z}^3) &= Q_0(\mathbb{Z}^3 g) && \text{(definition of } g\text{)} \\
&= Q_0(\mathbb{Z}^3 \Gamma g) && (\mathbb{Z}^3\Gamma = \mathbb{Z}^3) \\
&= Q_0(\mathbb{Z}^3 \Gamma gH) && \text{(definition of } H\text{)} \\
&\approx Q_0(\mathbb{Z}^3 G) && \text{((1.2.7) and } Q_0 \text{ is continuous)} \\
&= Q_0(\mathbb{R}^3 \smallsetminus \{0\}) && (vG = \mathbb{R}^3 \smallsetminus \{0\} \text{ for } v \neq 0) \\
&= \mathbb{R},
\end{aligned}$$

where "$\approx$" means "is dense in." (The nonzero scalar $\lambda$ is suppressed
in this calculation; multiplying by $\lambda$ does not affect density in $\mathbb{R}$.)

Case 2. Assume $S = H$. This is a degenerate case; we will show that
$Q$ is a scalar multiple of a form with integer coefficients. To keep the
proof short, we will apply some of the theory of algebraic groups. The
interested reader may consult Chapter 4 to fill in the gaps.

Let $\Gamma_g = \Gamma \cap (gHg^{-1})$. Because the orbit $[gH] = [gS]$ has finite
$H$-invariant measure, we know that $\Gamma_g$ is a lattice in $gHg^{-1} = \operatorname{SO}(Q)^\circ$.
So the Borel Density Theorem (4.7.1) implies $\operatorname{SO}(Q)^\circ$ is contained
in the Zariski closure of $\Gamma_g$. Because $\Gamma_g \subset \Gamma = \operatorname{SL}(3,\mathbb{Z})$, this implies
that the (almost) algebraic group $\operatorname{SO}(Q)^\circ$ is defined over $\mathbb{Q}$ (see
Exer. 4.8#1). Therefore, up to a scalar multiple, $Q$ has integer coefficients
(see Exer. 4.8#5). □

Exercises for §1.2.

#1. Suppose $\alpha$ and $\beta$ are nonzero real numbers, such that $\alpha/\beta$ is
irrational, and define $L(x,y) = \alpha x + \beta y$. Show $L(\mathbb{Z}^2)$ is dense
in $\mathbb{R}$. (Margulis' Theorem (1.2.2) is a generalization to quadratic
forms of this rather trivial observation about linear forms.)

#2. Let $Q_1(x,y) = x^2 - 3xy + y^2$ and $Q_2(x,y) = x^2 - 2xy + y^2$. Show

(a) $Q_1(\mathbb{R}^2)$ contains both positive and negative numbers, but

(b) $Q_2(\mathbb{R}^2)$ does not contain any negative numbers.

#3. Suppose $Q(x_1, \ldots, x_n)$ is a quadratic form, and let $e_n = (0, \ldots, 0, 1)$
be the $n$th standard basis vector. Show

$$Q(v + e_n) = Q(v - e_n) \quad \text{for all } v \in \mathbb{R}^n$$

if and only if there is a quadratic form $Q'(x_1, \ldots, x_{n-1})$ in $n - 1$
variables, such that $Q(x_1, \ldots, x_n) = Q'(x_1, \ldots, x_{n-1})$ for all
$x_1, \ldots, x_n \in \mathbb{R}$.

#4. Show that the form $Q$ of Eg. 1.2.3 is not a scalar multiple of a
form with integer coefficients; that is, there does not exist $k \in \mathbb{R}^\times$,
such that all the coefficients of $kQ$ are integers.

#5. Suppose $Q$ is a quadratic form in $n$ variables. Define

$$B \colon \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R} \quad \text{by} \quad B(v,w) = \tfrac{1}{4}\bigl(Q(v + w) - Q(v - w)\bigr).$$

(a) Show that $B$ is a symmetric bilinear form on $\mathbb{R}^n$. That is,
for $v, v_1, v_2, w \in \mathbb{R}^n$ and $\alpha \in \mathbb{R}$, we have:

(i) $B(v, w) = B(w, v)$,

(ii) $B(v_1 + v_2, w) = B(v_1, w) + B(v_2, w)$, and

(iii) $B(\alpha v, w) = \alpha B(v, w)$.

(b) For $h \in \operatorname{SL}(n,\mathbb{R})$, show $h \in \operatorname{SO}(Q)$ if and only if $B(vh, wh) =
B(v, w)$ for all $v, w \in \mathbb{R}^n$.

(c) We say that the bilinear form $B$ is nondegenerate if for
every nonzero $v \in \mathbb{R}^n$, there is some nonzero $w \in \mathbb{R}^n$, such
that $B(v, w) \neq 0$. Show that $Q$ is nondegenerate if and only
if $B$ is nondegenerate.

(d) For $v \in \mathbb{R}^n$, let $v^\perp = \{\, w \in \mathbb{R}^n \mid B(v, w) = 0 \,\}$. Show:

(i) $v^\perp$ is a subspace of $\mathbb{R}^n$, and

(ii) if B is nondegenerate and v ^ 0, then R™ = Rw © t;- 1 . 

#6. (a) Show that $Q_{k,n-k}$ is a nondegenerate quadratic form (in
$n$ variables).

(b) Show that $Q_{k,n-k}$ is indefinite if and only if $k \notin \{0,n\}$.

(c) A subspace $V$ of $\mathbb{R}^n$ is totally isotropic for a quadratic
form $Q$ if $Q(v) = 0$ for all $v \in V$. Show that $\min(k, n - k)$
is the maximum dimension of a totally isotropic subspace
for $Q_{k,n-k}$.

(d) Let $Q$ be a nondegenerate quadratic form in $n$ variables.
Show there exists a unique $k \in \{0,1,\ldots,n\}$, such that
there is an invertible linear transformation $T$ of $\mathbb{R}^n$ with
$Q = Q_{k,n-k} \circ T$. We say that the signature of $Q$ is $(k, n-k)$.
[Hint: (6d) Choose $v \in \mathbb{R}^n$ with $Q(v) \neq 0$. By induction on $n$, the
restriction of $Q$ to $v^\perp$ can be transformed to $Q_{k',n-1-k'}$.]

#7. Let $Q$ be a real quadratic form in $n$ variables. Show that if $Q$
is positive definite (that is, if $Q(v) > 0$ for every nonzero $v \in \mathbb{R}^n$),
then $Q(\mathbb{Z}^n)$ is a discrete subset of $\mathbb{R}$.

#8. Show: 

(a) If $\alpha$ is an irrational root of a quadratic polynomial (with
integer coefficients), then there exists $\epsilon > 0$, such that

$|\alpha - (p/q)| > \epsilon/q^2$,
for all $p, q \in \mathbb{Z}$ (with $p, q \neq 0$).

[Hint: $k(x - \alpha)(x - \beta)$ has integer coefficients, for some $k \in \mathbb{Z}^+$
and some $\beta \in \mathbb{R} \smallsetminus \{\alpha\}$.]

(b) The quadratic form $Q(x,y) = x^2 - (3 + 2\sqrt{2})y^2$ is real,
indefinite, and nondegenerate, and is not a scalar multiple of
a form with integer coefficients.

(c) $Q(\mathbb{Z}, \mathbb{Z})$ is not dense in $\mathbb{R}$.

[Hint: $\sqrt{3 + 2\sqrt{2}} = 1 + \sqrt{2}$ is a root of a quadratic polynomial.]

#9. Suppose $Q(x_1, x_2)$ is a real, indefinite quadratic form in two
variables, and that $Q(x,y)$ is not a scalar multiple of a form with
integer coefficients, and define $Q^*(y_1, y_2, y_3) = Q(y_1, y_2 - y_3)$.

(a) Show that $Q^*$ is a real, indefinite quadratic form in three
variables, and that $Q^*$ is not a scalar multiple of a form with
integer coefficients.

(b) Show that if $Q(\mathbb{Z}^2)$ is not dense in $\mathbb{R}$, then $Q^*(\mathbb{Z}^3)$ is not
dense in $\mathbb{R}$.

#10. Show that if $Q(x_1, \ldots, x_n)$ is a quadratic form, and $Q(\mathbb{Z}^n)$ is
dense in $\mathbb{R}$, then $Q$ is not a scalar multiple of a form with integer
coefficients.

#11. Show that SO(Q) is connected if Q is definite. 

[Hint: Induction on $n$. There is a natural embedding of $\mathrm{SO}(n - 1)$ in
$\mathrm{SO}(n)$, such that the vector $e_n = (0, 0, \ldots, 0, 1)$ is fixed by $\mathrm{SO}(n - 1)$.
For $n \ge 2$, the map $\mathrm{SO}(n - 1)g \mapsto e_n g$ is a homeomorphism from
$\mathrm{SO}(n - 1)\backslash \mathrm{SO}(n)$ onto the $(n - 1)$-sphere $S^{n-1}$.]

#12. (Witt's Theorem) Suppose $v, w \in \mathbb{R}^{m+n}$ with $Q_{m,n}(v) = Q_{m,n}(w) \neq
0$, and assume $m + n \ge 2$. Show there exists $g \in \mathrm{SO}(m,n)$ with
$vg = w$.

[Hint: There is a linear map $T \colon v^\perp \to w^\perp$ with $Q_{m,n}(xT) = Q_{m,n}(x)$
for all $x$ (see Exer. 6). (Use the assumption $m + n \ge 2$ to arrange for
$g$ to have determinant $1$, rather than $-1$.)]

#13. Show that $\mathrm{SO}(m, n)$ has no more than two components if $m, n \ge
1$. (In fact, although you do not need to prove this, it has exactly
two components.)

[Hint: Similar to Exer. 11. (Use Exer. 12.) If $m > 1$, then $\{\, v \in \mathbb{R}^{m+n} \mid
Q_{m,n}(v) = 1 \,\}$ is connected. The base case $m = n = 1$ should be done
separately.]

#14. In the notation of the proof of Thm. 1.2.2, show $\mathrm{SO}(Q)^\circ =
gHg^{-1}$.

#15. Suppose $Q$ satisfies the hypotheses of Thm. 1.2.2. Show there
exist $v_1, v_2, v_3 \in \mathbb{Z}^n$, such that the quadratic form $Q'$ on $\mathbb{R}^3$,
defined by $Q'(x_1, x_2, x_3) = Q(x_1 v_1 + x_2 v_2 + x_3 v_3)$, also satisfies the
hypotheses of Thm. 1.2.2.

[Hint: Choose any $v_1, v_2$ such that $Q(v_1)/Q(v_2)$ is negative and
irrational. Then choose $v_3$ generically (so $Q'$ is nondegenerate).]

#16. (Requires some Lie theory) Show: 

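(a) The determinant function $\det$ is a quadratic form on $\mathfrak{sl}(2, \mathbb{R})$
of signature $(2, 1)$.

(b) The adjoint representation $\mathrm{Ad}_{\mathrm{SL}(2,\mathbb{R})}$ maps $\mathrm{SL}(2,\mathbb{R})$ into
$\mathrm{SO}(\det)$.

(c) $\mathrm{SL}(2,\mathbb{R})$ is locally isomorphic to $\mathrm{SO}(2, 1)^\circ$.

(d) $\mathrm{SO}(2, 1)^\circ$ is generated by unipotent elements.
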
#17. (Requires some Lie theory) 

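(a) Show that $\mathfrak{so}(2, 1)$ is a maximal subalgebra of the Lie algebra
$\mathfrak{sl}(3,\mathbb{R})$. That is, there does not exist a subalgebra $\mathfrak{h}$ with
$\mathfrak{so}(2,1) \subsetneq \mathfrak{h} \subsetneq \mathfrak{sl}(3,\mathbb{R})$.

(b) Conclude that if $S$ is any closed, connected subgroup of
$\mathrm{SL}(3,\mathbb{R})$ that contains $\mathrm{SO}(2, 1)$, then

either $S = \mathrm{SO}(2, 1)$ or $S = \mathrm{SL}(3, \mathbb{R})$.

[Hint:
\[ u = \begin{pmatrix} 0 & 1 & 1 \\ -1 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix} \]
is a nilpotent element of $\mathfrak{so}(2, 1)$, and the
kernel of $\mathrm{ad}_{\mathfrak{sl}(3,\mathbb{R})}\, u$ is only 2-dimensional. Since $\mathfrak{h}$ is a submodule of
$\mathfrak{sl}(3,\mathbb{R})$, the conclusion follows (see Exer. 4.9#7b).]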

1.3. Measure-theoretic versions of Ratner's Theorem 

For unipotent flows, Ratner's Orbit Closure Theorem (1.1.14)
states that the closure of each orbit is a nice, geometric subset $[xS]$
of the space $X = \Gamma\backslash G$. This means that the orbit is dense in $[xS]$; in
fact, it turns out to be uniformly distributed in $[xS]$. Before making
a precise statement, let us look at a simple example.

(1.3.1) Example. As in Eg. 1.1.1, let $\varphi_t$ be the flow

\[ \varphi_t([x]) = [x + tv] \]

on $\mathbb{T}^n$ defined by a vector $v \in \mathbb{R}^n$. Let $\mu$ be the Lebesgue measure
on $\mathbb{T}^n$, normalized to be a probability measure (so $\mu(\mathbb{T}^n) = 1$).

1) Assume $n = 2$, so we may write $v = (a,b)$. If $a/b$ is irrational,
then every orbit of $\varphi_t$ is dense in $\mathbb{T}^2$ (see Exer. 1.1#3). In fact,
every orbit is uniformly distributed in $\mathbb{T}^2$: if $B$ is any nice open
subset of $\mathbb{T}^2$ (such as an open ball), then the amount of time that
each orbit spends in $B$ is proportional to the area of $B$. More
precisely, for each $x \in \mathbb{T}^2$, and letting $\lambda$ be the Lebesgue measure
on $\mathbb{R}$, we have

\[ \frac{\lambda\bigl(\{\, t \in [0,T] \mid \varphi_t(x) \in B \,\}\bigr)}{T} \to \mu(B) \quad \text{as } T \to \infty \tag{1.3.2} \]

(see Exer. 1).

2) Equivalently, if

• $v = (a, b)$ with $a/b$ irrational,
• $x \in \mathbb{T}^2$, and

• $f$ is any continuous function on $\mathbb{T}^2$,
then

\[ \lim_{T \to \infty} \frac{\int_0^T f(\varphi_t(x))\, dt}{T} = \int_{\mathbb{T}^2} f \, d\mu \tag{1.3.3} \]

(see Exer. 2).

3) Suppose now that $n = 3$, and assume $v = (a, b, 0)$, with $a/b$
irrational. Then the orbits of $\varphi_t$ are not dense in $\mathbb{T}^3$, so they
are not uniformly distributed in $\mathbb{T}^3$ (with respect to the usual
Lebesgue measure on $\mathbb{T}^3$). Instead, each orbit is uniformly
distributed in some subtorus of $\mathbb{T}^3$: given $x = (x_1, x_2, x_3) \in \mathbb{T}^3$, let
$\mu_2$ be the Haar measure on the horizontal 2-torus $\mathbb{T}^2 \times \{x_3\}$ that
contains $x$. Then

\[ \frac{1}{T} \int_0^T f(\varphi_t(x))\, dt \to \int_{\mathbb{T}^2 \times \{x_3\}} f \, d\mu_2 \quad \text{as } T \to \infty \]

(see Exer. 3).

4) In general, for any $n$ and $v$, and any $x \in \mathbb{T}^n$, there is a subtorus $S$
of $\mathbb{T}^n$, with Haar measure $\mu_S$, such that

\[ \frac{1}{T} \int_0^T f(\varphi_t(x))\, dt \to \int_S f \, d\mu_S \]

as $T \to \infty$ (see Exer. 4).
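
The equidistribution statement (1.3.3) lends itself to a quick numerical experiment. The sketch below is illustrative only: the direction $v = (1, \sqrt{2})$, the starting point, the test function, and the helper name `time_average` are all assumptions chosen for the example. It approximates the time average of a continuous function along one orbit of the flow on $\mathbb{T}^2$ by the midpoint rule; the chosen test function has space average exactly $1$.

```python
import math

def time_average(f, x, v, T, steps=200_000):
    """Approximate (1/T) * integral_0^T f(phi_t(x)) dt by the midpoint rule,
    where phi_t(x) = x + t*v (mod 1) is the linear flow on the 2-torus."""
    dt = T / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt
        px = (x[0] + t * v[0]) % 1.0
        py = (x[1] + t * v[1]) % 1.0
        total += f(px, py)
    return total / steps

def f(px, py):
    # Continuous test function on the torus; its integral over T^2 is exactly 1.
    return 1.0 + math.cos(2 * math.pi * px) * math.cos(2 * math.pi * py)

v = (1.0, math.sqrt(2.0))  # hypothetical direction with a/b irrational
x = (0.3, 0.7)             # arbitrary starting point
avg = time_average(f, x, v, T=2000.0)
print(avg)
```

Replacing $v$ by a vector with rational slope, such as $(1, 1)$, gives a periodic orbit, and the time average then depends on the starting point instead of matching the space average.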



The above example generalizes, in a natural way, to all unipotent
flows:

(1.3.4) Theorem (Ratner Equidistribution Theorem). If

• $G$ is any Lie group,

• $\Gamma$ is any lattice in $G$, and

• $\varphi_t$ is any unipotent flow on $\Gamma\backslash G$,

then each $\varphi_t$-orbit is uniformly distributed in its closure.

(1.3.5) Remark. Here is a more precise statement of Thm. 1.3.4. For
any fixed $x \in G$, Ratner's Theorem (1.1.14) provides a connected,
closed subgroup $S$ of $G$ (see 1.1.15), such that

1) $\{u^t\}_{t \in \mathbb{R}} \subset S$,

2) the image $[xS]$ of $xS$ in $\Gamma\backslash G$ is closed, and has finite $S$-invariant
volume, and

3) the $\varphi_t$-orbit of $[x]$ is dense in $[xS]$.

Let $\mu_S$ be the (unique) $S$-invariant probability measure on $[xS]$. Then
Thm. 1.3.4 asserts, for every continuous function $f$ on $\Gamma\backslash G$ with
compact support, that

\[ \frac{1}{T} \int_0^T f(\varphi_t(x))\, dt \to \int_{[xS]} f \, d\mu_S \quad \text{as } T \to \infty. \]

This theorem yields a classification of the $\varphi_t$-invariant probability
measures.

(1.3.6) Definition. Let

• $X$ be a metric space,

• $\varphi_t$ be a continuous flow on $X$, and

• $\mu$ be a measure on $X$.
We say:

1) $\mu$ is $\varphi_t$-invariant if $\mu(\varphi_t(A)) = \mu(A)$, for every Borel subset $A$
of $X$, and every $t \in \mathbb{R}$.

2) $\mu$ is ergodic if $\mu$ is $\varphi_t$-invariant, and every $\varphi_t$-invariant Borel
function on $X$ is essentially constant (w.r.t. $\mu$). (A function $f$ is
essentially constant on $X$ if there is a set $E$ of measure $0$, such
that $f$ is constant on $X \smallsetminus E$.)

Results of Functional Analysis (such as Choquet's Theorem) imply 
that every invariant probability measure is a convex combination (or, 
more generally, a direct integral) of ergodic probability measures (see 
Exer. 6). (See §3.3 for more discussion of the relationship between arbi- 
trary measures and ergodic measures.) Thus, in order to understand all 
of the invariant measures, it suffices to classify the ergodic ones. Com- 
bining Thm. 1.3.4 with the Pointwise Ergodic Theorem (3.1.3) implies 
that these ergodic measures are of a nice geometric form (see Exer. 7): 
(1.3.7) Corollary (Ratner Measure Classification Theorem). If

• $G$ is any Lie group,

• $\Gamma$ is any lattice in $G$, and

• $\varphi_t$ is any unipotent flow on $\Gamma\backslash G$,

then every ergodic $\varphi_t$-invariant probability measure on $\Gamma\backslash G$ is
homogeneous.

That is, every ergodic $\varphi_t$-invariant probability measure is of the
form $\mu_S$, for some $x$ and some subgroup $S$ as in Rem. 1.3.5.

A logical development (and the historical development) of the ma- 
terial proceeds in the opposite direction: instead of deriving Cor. 1.3.7 
from Thm. 1.3.4, the main goal of these lectures is to explain the main 
ideas in a direct proof of Cor. 1.3.7. Then Thms. 1.1.14 and 1.3.4 can 
be obtained as corollaries. As an illustrative example of this opposite 
direction — how knowledge of invariant measures can yield information 
about closures of orbits — let us prove the following classical fact. (A 
more complete discussion appears in Sect. 1.9.) 

(1.3.8) Definition. Let $\varphi_t$ be a continuous flow on a metric space $X$.

• $\varphi_t$ is minimal if every orbit is dense in $X$.

• $\varphi_t$ is uniquely ergodic if there is a unique $\varphi_t$-invariant
probability measure on $X$.

(1.3.9) Proposition. Suppose

• $G$ is any Lie group,

• $\Gamma$ is any lattice in $G$, such that $\Gamma\backslash G$ is compact, and

• $\varphi_t$ is any unipotent flow on $\Gamma\backslash G$.

If $\varphi_t$ is uniquely ergodic, then $\varphi_t$ is minimal.

Proof. We prove the contrapositive: assuming that some orbit $\varphi_{\mathbb{R}}(x)$
is not dense in $\Gamma\backslash G$, we will show that the $G$-invariant measure $\mu_G$ is
not the only $\varphi_t$-invariant probability measure on $\Gamma\backslash G$.

Let $\Omega$ be the closure of $\varphi_{\mathbb{R}}(x)$. Then $\Omega$ is a compact $\varphi_t$-invariant
subset of $\Gamma\backslash G$ (see Exer. 8), so there is a $\varphi_t$-invariant probability
measure $\mu$ on $\Gamma\backslash G$ that is supported on $\Omega$ (see Exer. 9). Because

\[ \operatorname{supp} \mu \subset \Omega \subsetneq \Gamma\backslash G = \operatorname{supp} \mu_G, \]

we know that $\mu \neq \mu_G$. Hence, there are (at least) two different
$\varphi_t$-invariant probability measures on $\Gamma\backslash G$, so $\varphi_t$ is not uniquely ergodic.

□



(1.3.10) Remark. 
1) There is no need to assume $\Gamma$ is a lattice in Cor. 1.3.7; the
conclusion remains true when $\Gamma$ is any closed subgroup of $G$.
However, to avoid confusion, let us point out that this is not
true of the Orbit Closure Theorem: there are counterexamples
to (1.1.19) in some cases where $\Gamma\backslash G$ is not assumed to have finite
volume. For example, a fractal orbit closure for $a^t$ on $\Gamma\backslash G$ yields
a fractal orbit closure for $\Gamma$ on $\{a^t\}\backslash G$, even though the lattice $\Gamma$
may be generated by unipotent elements.

2) An appeal to "Ratner's Theorem" in the literature could be refer- 
ring to any of Ratner's three major theorems: her Orbit Closure 
Theorem (1.1.14), her Equidistribution Theorem (1.3.4), or her 
Measure Classification Theorem (1.3.7). 

3) There is not universal agreement on the names of these three ma- 
jor theorems of Ratner. For example, the Measure Classification 
Theorem is also known as "Ratner's Measure-Rigidity Theorem" 
or "Ratner's Theorem on Invariant Measures," and the Orbit Clo- 
sure Theorem is also known as the "topological version" of her 
theorem. 

4) Many authors (including M. Ratner) use the adjective algebraic,
rather than homogeneous, to describe measures $\mu_S$ as in (1.3.5).
This is because $\mu_S$ is defined via an algebraic (or, more precisely,
group-theoretic) construction.

Exercises for §1.3. 

#1. Verify Eg. 1.3.1(1). 

[Hint: It may be easier to do Exer. 2 first. The characteristic function 
of B can be approximated by continuous functions.] 

#2. Verify Eg. 1.3.1(2); show that if $a/b$ is irrational, and $f$ is any
continuous function on $\mathbb{T}^2$, then (1.3.3) holds.
[Hint: Linear combinations of functions of the form

\[ f(x, y) = \exp 2\pi(mx + ny)i \]

are dense in the space of continuous functions.

Alternate solution: If $T_0$ is sufficiently large, then, for every $x \in \mathbb{T}^2$, the
segment $\{\varphi_t(x)\}_{t=0}^{T_0}$ comes within $\delta$ of every point in $\mathbb{T}^2$ (because $\mathbb{T}^2$ is
compact and abelian, and the orbits of $\varphi_t$ are dense). Therefore, the
uniform continuity of $f$ implies that if $T$ is sufficiently large, then the
value of $(1/T) \int_0^T f(\varphi_t(x))\, dt$ varies by less than $\epsilon$ as $x$ varies over $\mathbb{T}^2$.]

#3. Verify Eg. 1.3.1(3). 

#4. Verify Eg. 1.3.1(4). 

#5. Let

• $\varphi_t$ be a continuous flow on a manifold $X$,
• $\mathrm{Prob}(X)_{\varphi_t}$ be the set of $\varphi_t$-invariant Borel probability
measures on $X$, and

• $\mu \in \mathrm{Prob}(X)_{\varphi_t}$.

Show that the following are equivalent:

(a) $\mu$ is ergodic;

(b) every $\varphi_t$-invariant Borel subset of $X$ is either null or conull;

(c) $\mu$ is an extreme point of $\mathrm{Prob}(X)_{\varphi_t}$, that is, $\mu$ is not
a convex combination of two other measures in the space
$\mathrm{Prob}(X)_{\varphi_t}$.

[Hint: (5a⇒5c) If $\mu = a_1 \mu_1 + a_2 \mu_2$, consider the Radon-Nikodym
derivatives of $\mu_1$ and $\mu_2$ (w.r.t. $\mu$). (5c⇒5b) If $A$ is any subset of $X$,
then $\mu$ is the sum of two measures, one supported on $A$, and the other
supported on the complement of $A$.]

#6. Choquet's Theorem states that if $C$ is any compact subset of
a Banach space, then each point in $C$ is of the form $\int_C c \, d\mu(c)$,
where $\mu$ is a probability measure supported on the extreme points
of $C$. Assuming this fact, show that every $\varphi_t$-invariant probability
measure is an integral of ergodic $\varphi_t$-invariant measures.

#7. Prove Cor. 1.3.7. 

[Hint: Use (1.3.4) and (3.1.3).] 

#8. Let 

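• $\varphi_t$ be a continuous flow on a metric space $X$,

• $x \in X$, and

• $\varphi_{\mathbb{R}}(x) = \{\, \varphi_t(x) \mid t \in \mathbb{R} \,\}$ be the orbit of $x$.

Show that the closure $\overline{\varphi_{\mathbb{R}}(x)}$ of $\varphi_{\mathbb{R}}(x)$ is $\varphi_t$-invariant; that is,
$\varphi_t\bigl(\overline{\varphi_{\mathbb{R}}(x)}\bigr) = \overline{\varphi_{\mathbb{R}}(x)}$ for all $t \in \mathbb{R}$.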

#9. Let 

• $\varphi_t$ be a continuous flow on a metric space $X$, and

• $\Omega$ be a nonempty, compact, $\varphi_t$-invariant subset of $X$.

Show there is a $\varphi_t$-invariant probability measure $\mu$ on $X$, such
that $\operatorname{supp}(\mu) \subset \Omega$. (In other words, the complement of $\Omega$ is a null
set, w.r.t. $\mu$.)

[Hint: Fix $x \in \Omega$. For each $n \in \mathbb{Z}^+$, $(1/n) \int_0^n f(\varphi_t(x))\, dt$ defines a
probability measure $\mu_n$ on $X$. The limit of any convergent subsequence
is $\varphi_t$-invariant.]

#10. Let 

• $S^1 = \mathbb{R} \cup \{\infty\}$ be the one-point compactification of $\mathbb{R}$, and

• $\varphi_t(x) = x + t$ for $t \in \mathbb{R}$ and $x \in S^1$.

Show $\varphi_t$ is a flow on $S^1$ that is uniquely ergodic (and continuous)
but not minimal.

#11. Suppose $\varphi_t$ is a uniquely ergodic, continuous flow on a compact
metric space $X$. Show $\varphi_t$ is minimal if and only if there is a
$\varphi_t$-invariant probability measure $\mu$ on $X$, such that the support of $\mu$
is all of $X$.

#12. Show that the conclusion of Exer. 9 can fail if we omit the
hypothesis that $\Omega$ is compact.
[Hint: Let $\Omega = X = \mathbb{R}$, and define $\varphi_t(x) = x + t$.]

1.4. Some applications of Ratner's Theorems 

This section briefly describes a few of the many results that rely 
crucially on Ratner's Theorems (or the methods behind them). Their 
proofs require substantial new ideas, so, although we will emphasize 
the role of Ratner's Theorems, we do not mean to imply that any of 
these theorems are merely corollaries. 

1.4A. Quantitative versions of Margulis' Theorem on values
of quadratic forms. As discussed in §1.2, G. A. Margulis proved,
under appropriate hypotheses on the quadratic form $Q$, that the values
of $Q$ on $\mathbb{Z}^n$ are dense in $\mathbb{R}$. By a more sophisticated argument, it can
be shown (except in some small cases) that the values are uniformly
distributed in $\mathbb{R}$, not just dense:

(1.4.1) Theorem. Suppose

• $Q$ is a real, nondegenerate quadratic form,

• $Q$ is not a scalar multiple of a form with integer coefficients, and

• the signature $(p, q)$ of $Q$ satisfies $p \ge 3$ and $q \ge 1$.
Then, for any interval $(a, b)$ in $\mathbb{R}$, we have

\[ \frac{\#\{\, v \in \mathbb{Z}^{p+q} \mid a < Q(v) < b,\ \|v\| \le N \,\}}{\operatorname{vol}\{\, v \in \mathbb{R}^{p+q} \mid a < Q(v) < b,\ \|v\| \le N \,\}} \to 1 \quad \text{as } N \to \infty. \]


(1.4.2) Remark.

1) By calculating the appropriate volume, one finds a constant $C_Q$,
depending only on $Q$, such that, as $N \to \infty$,

\[ \#\{\, v \in \mathbb{Z}^{p+q} \mid a < Q(v) < b,\ \|v\| \le N \,\} \sim (b - a)\, C_Q N^{p+q-2}. \]

2) The restriction on the signature of $Q$ cannot be eliminated; there
are counterexamples of signature $(2, 2)$ and $(2, 1)$.
Why Ratner's Theorem is relevant. We provide only an indication 
of the direction of attack, not an actual proof of the theorem. 

1) Let $K = \mathrm{SO}(p) \times \mathrm{SO}(q)$, so $K$ is a compact subgroup of $\mathrm{SO}(p, q)$.

2) For $c, r \in \mathbb{R}$, it is not difficult to see that $K$ is transitive on

\[ \{\, v \in \mathbb{R}^{p+q} \mid Q_{p,q}(v) = c,\ \|v\| = r \,\} \]

(unless $q = 1$, in which case $K$ has two orbits).

3) Fix $g \in \mathrm{SL}(p + q, \mathbb{R})$, such that $Q = Q_{p,q} \circ g$. (Actually, $Q$ may
be a scalar multiple of $Q_{p,q} \circ g$, but let us ignore this issue.)

4) Fix a nontrivial one-parameter unipotent subgroup $u^t$ of $\mathrm{SO}(p, q)$.

5) Let $S$ be a bounded open set that

• intersects $Q_{p,q}^{-1}(c)$, for all $c \in (a, b)$, and

• does not contain any fixed points of $u^t$ in its closure.

By being a bit more careful in the choice of $S$ and $u^t$, we may
arrange that $\|wu^t\|$ is within a constant factor of $t^2$ for all $w \in S$
and all large $t \in \mathbb{R}$.

6) If $v$ is any large element of $\mathbb{R}^{p+q}$, with $Q_{p,q}(v) \in (a, b)$, then there
is some $w \in S$, such that $Q_{p,q}(w) = Q_{p,q}(v)$. If we choose $t \in \mathbb{R}^+$
with $\|wu^{-t}\| = \|v\|$ (note that $t < C\sqrt{\|v\|}$, for an appropriate
constant $C$), then $w \in vKu^t$. Therefore

\[ \int_0^{C\sqrt{\|v\|}} \int_K \chi_S(vku^t)\, dk\, dt \neq 0, \tag{1.4.3} \]

where $\chi_S$ is the characteristic function of $S$.

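7) We have

\[ \{\, v \in \mathbb{Z}^{p+q} \mid a < Q(v) < b,\ \|v\| \le N \,\} = \{\, v \in \mathbb{Z}^{p+q} \mid a < Q_{p,q}(vg) < b,\ \|v\| \le N \,\}. \]

From (1.4.3), we see that the cardinality of the right-hand side
can be approximated by

\[ \sum_{v \in \mathbb{Z}^{p+q}} \int_0^{C\sqrt{N}} \int_K \chi_S(vgku^t)\, dk\, dt. \]

By

• bringing the sum inside the integrals, and

• defining $\tilde\chi_S \colon \Gamma\backslash G \to \mathbb{R}$ by

\[ \tilde\chi_S(\Gamma x) = \sum_{v \in \mathbb{Z}^{p+q}} \chi_S(vx), \]

where $G = \mathrm{SL}(p + q, \mathbb{R})$ and $\Gamma = \mathrm{SL}(p + q, \mathbb{Z})$,

we obtain

\[ \int_0^{C\sqrt{N}} \int_K \tilde\chi_S(\Gamma gku^t)\, dk\, dt. \tag{1.4.4} \]

The outer integral is the type that can be calculated from Ratner's
Equidistribution Theorem (1.3.4) (except that the integrand
is not continuous and may not have compact support).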

(1.4.5) Remark. 

1) Because of technical issues, it is actually a more precise version 
(1.9.5) of equidistribution that is used to estimate the integral 
(1.4.4). In fact, the issues are so serious that the above argument 
actually yields only a lower bound on the integral. Obtaining the 
correct upper bound requires additional difficult arguments. 

2) Furthermore, the conclusion of Thm. 1.4.1 fails for some forms of
signature $(2,2)$ or $(2, 1)$; the limit may be $+\infty$.

1.4B. Arithmetic Quantum Unique Ergodicity. Suppose $\Gamma$
is a lattice in $G = \mathrm{SL}(2, \mathbb{R})$, such that $\Gamma\backslash G$ is compact. Then $M = \Gamma\backslash\mathbb{H}$
is a compact manifold. (We should assume here that $\Gamma$ has no elements
of finite order.) The hyperbolic metric on $\mathbb{H}$ yields a Riemannian
metric on $M$, and there is a corresponding Laplacian $\Delta$ and volume
measure $\mathrm{vol}$ (normalized to be a probability measure). Let

\[ 0 = \lambda_0 < \lambda_1 \le \lambda_2 \le \cdots \]

be the eigenvalues of $\Delta$ (with multiplicity). For each $\lambda_n$, there is a
corresponding eigenfunction $\phi_n$, which we assume to be normalized
(and real valued), so that $\int_M \phi_n^2 \, d\mathrm{vol} = 1$.

In Quantum Mechanics, one may think of $\phi_n$ as a possible state of
a particle in a certain system; if the particle is in this state, then the
probability of finding it at any particular location on $M$ is represented
by the probability distribution $\phi_n^2 \, d\mathrm{vol}$. It is natural to investigate the
limit as $\lambda_n \to \infty$, for this describes the behavior that can be expected
when there is enough energy that quantum effects can be ignored, and
the laws of classical mechanics can be applied.

It is conjectured that, in this classical limit, the particle becomes 
uniformly distributed: 

(1.4.6) Conjecture (Quantum Unique Ergodicity).

\[ \lim_{n \to \infty} \phi_n^2 \, d\mathrm{vol} = d\mathrm{vol}. \]

This conjecture remains open, but it has been proved in an impor- 
tant special case. 



(1.4.7) Definition. 
1) If $\Gamma$ belongs to a certain family of lattices (constructed by a
certain method from an algebra of quaternions over $\mathbb{Q}$) then we say
that $\Gamma$ is a congruence lattice. Although these are very special
lattices, they arise very naturally in many applications in number
theory and elsewhere.

2) If the eigenvalue $\lambda_n$ is simple (i.e., if $\lambda_n$ is not a repeated
eigenvalue), then the corresponding eigenfunction $\phi_n$ is uniquely
determined (up to a sign). If $\lambda_n$ is not simple, then there is an
entire space of possibilities for $\phi_n$, and this ambiguity results in
a serious difficulty.

Under the assumption that $\Gamma$ is a congruence lattice, it is
possible to define a particular orthonormal basis of each eigenspace;
the elements of this basis are well defined (up to a sign) and
are called Hecke eigenfunctions (or Hecke-Maass cusp
forms).

We remark that if $\Gamma$ is a congruence lattice, and there are no repeated
eigenvalues, then each $\phi_n$ is automatically a Hecke eigenfunction.

(1.4.8) Theorem. If

• $\Gamma$ is a congruence lattice, and

• each $\phi_n$ is a Hecke eigenfunction,
then $\lim_{n \to \infty} \phi_n^2 \, d\mathrm{vol} = d\mathrm{vol}$.

Why Ratner's Theorem is relevant. Let $\mu$ be a limit of some
subsequence of $\phi_n^2 \, d\mathrm{vol}$. Then $\mu$ can be lifted to an $a^t$-invariant probability
measure $\hat\mu$ on $\Gamma\backslash G$. Unfortunately $a^t$ is not unipotent, so Ratner's
Theorem does not immediately apply.

Because each $\phi_n$ is assumed to be a Hecke eigenfunction, one is
able to further lift $\hat\mu$ to a measure $\tilde\mu$ on a certain homogeneous space
$\Gamma\backslash\bigl(G \times \mathrm{SL}(2, \mathbb{Q}_p)\bigr)$, where $\mathbb{Q}_p$ denotes the field of $p$-adic numbers for an
appropriate prime $p$. There is an additional action coming from the
factor $\mathrm{SL}(2, \mathbb{Q}_p)$. By combining this action with the "Shearing Property"
of the $u^t$-flow, much as in the proof of (1.6.10) below, one shows that $\tilde\mu$
is $u^t$-invariant. (This argument requires one to know that the entropy
$h_{\hat\mu}(a^t)$ is nonzero.) Then a version of Ratner's Theorem generalized to
apply to $p$-adic groups implies that $\tilde\mu$ is $\mathrm{SL}(2, \mathbb{R})$-invariant.

1.4C. Subgroups generated by lattices in opposite horo- 
spherical subgroups. 

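(1.4.9) Notation. For $1 \le k < \ell$, let

• $\mathbb{U}_{k,\ell} = \{\, g \in \mathrm{SL}(\ell, \mathbb{R}) \mid g_{i,j} = \delta_{i,j} \text{ if } i > k \text{ or } j \le k \,\}$, and

• $\mathbb{V}_{k,\ell} = \{\, g \in \mathrm{SL}(\ell, \mathbb{R}) \mid g_{i,j} = \delta_{i,j} \text{ if } j > k \text{ or } i \le k \,\}$.
(We remark that $\mathbb{V}_{k,\ell}$ is the transpose of $\mathbb{U}_{k,\ell}$.)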
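
The index conditions in Notation 1.4.9 are easy to misread, so it may help to generate the entry patterns mechanically. The sketch below is illustrative; the function names are hypothetical, and it assumes the reading in which an entry $g_{i,j}$ of the upper group is unconstrained exactly when $i \le k$ and $j > k$ (so that all other entries equal $\delta_{i,j}$). It marks free entries with `*` and checks nothing beyond the shape of the pattern.

```python
def pattern_U(k, l):
    """Entry pattern of the upper group: '*' marks a free entry, and every
    other entry equals the Kronecker delta. Indices are 1-based, and
    g_ij is free exactly when i <= k and j > k."""
    return [["*" if (i <= k and j > k) else ("1" if i == j else "0")
             for j in range(1, l + 1)] for i in range(1, l + 1)]

def pattern_V(k, l):
    """Entry pattern of the lower group: g_ij is free exactly when
    j <= k and i > k."""
    return [["*" if (j <= k and i > k) else ("1" if i == j else "0")
             for j in range(1, l + 1)] for i in range(1, l + 1)]

def transpose(rows):
    """Transpose a square pattern given as a list of rows."""
    return [list(col) for col in zip(*rows)]

for row in pattern_U(3, 5):
    print(" ".join(row))
```

The free block has $k(\ell - k)$ entries, so each of the two groups is isomorphic to the additive group $\mathbb{R}^{k(\ell-k)}$.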
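(1.4.10) Example.

\[ \mathbb{U}_{3,5} = \left\{ \begin{pmatrix} 1 & 0 & 0 & * & * \\ 0 & 1 & 0 & * & * \\ 0 & 0 & 1 & * & * \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix} \right\} \quad \text{and} \quad \mathbb{V}_{3,5} = \left\{ \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ * & * & * & 1 & 0 \\ * & * & * & 0 & 1 \end{pmatrix} \right\}. \]
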
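(1.4.11) Theorem. Suppose

• $\Gamma_U$ is a lattice in $\mathbb{U}_{k,\ell}$,

• $\Gamma_V$ is a lattice in $\mathbb{V}_{k,\ell}$,

• the subgroup $\Gamma = \langle \Gamma_U, \Gamma_V \rangle$ is discrete, and

• $\ell \ge 4$.

Then $\Gamma$ is a lattice in $\mathrm{SL}(\ell, \mathbb{R})$.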

Why Ratner's Theorem is relevant. Let $\mathcal{U}_{k,\ell}$ be the space of lattices
in $\mathbb{U}_{k,\ell}$ and $\mathcal{V}_{k,\ell}$ be the space of lattices in $\mathbb{V}_{k,\ell}$. (Actually, we consider
only lattices with the same "covolume" as $\Gamma_U$ or $\Gamma_V$, respectively.) The
block-diagonal subgroup $\mathrm{SL}(k,\mathbb{R}) \times \mathrm{SL}(\ell - k,\mathbb{R})$ normalizes $\mathbb{U}_{k,\ell}$ and
$\mathbb{V}_{k,\ell}$, so it acts by conjugation on $\mathcal{U}_{k,\ell} \times \mathcal{V}_{k,\ell}$. There is a natural
identification of this with an action by translations on a homogeneous space
of $\mathrm{SL}(k\ell,\mathbb{R}) \times \mathrm{SL}(k\ell,\mathbb{R})$, so Ratner's Theorem implies that the
closure of the orbit of $(\Gamma_U, \Gamma_V)$ is homogeneous (see 1.1.19). This means
that there are very few possibilities for the closure. By combining this
conclusion with the discreteness of $\Gamma$ (and other ideas), one can
establish that the orbit itself is closed. This implies a certain compatibility
between $\Gamma_U$ and $\Gamma_V$, which leads to the desired conclusion.

(1.4.12) Remark. For simplicity, we have stated only a very special
case of the above theorem. More generally, one can replace $\mathrm{SL}(\ell, \mathbb{R})$
with another simple Lie group of real rank at least 2, and replace $\mathbb{U}_{k,\ell}$
and $\mathbb{V}_{k,\ell}$ with a pair of opposite horospherical subgroups. The
conclusion should be that $\Gamma$ is a lattice in $G$, but this has only been proved
under certain additional technical assumptions.



1.4D. Other results. For the interested reader, we list some of 
the many additional publications that put Ratner's Theorems to good 
use in a variety of ways. 

• S. Adams: Containment does not imply Borel reducibility, in:
S. Thomas, ed., Set theory (Piscataway, NJ, 1999), pages 1-23. 
Amer. Math. Soc, Providence, RI, 2002. MR 2003j:03059 

• A. Borel and G. Prasad: Values of isotropic quadratic forms at 
S-integral points, Compositio Math. 83 (1992), no. 3, 347-372. 
MR 93j: 11022 
• N. Elkies and C. T. McMullen: Gaps in $\sqrt{n}$ mod 1 and ergodic
theory, Duke Math. J. 123 (2004), no. 1, 95-139. MR 2060024

• A. Eskin, H. Masur, and M. Schmoll: Billiards in rectangles with 
barriers, Duke Math. J. 118 (2003), no. 3, 427-463. MR 2004c:37059 

• A. Eskin, S. Mozes, and N. Shah: Unipotent flows and counting
lattice points on homogeneous varieties, Ann. of Math. 143
(1996), no. 2, 253-299. MR 97d:22012 

• A. Gorodnik: Uniform distribution of orbits of lattices on spaces 
of frames, Duke Math. J. 122 (2004), no. 3, 549-589. MR 2057018 

• J. Marklof: Pair correlation densities of inhomogeneous quadratic
forms, Ann. of Math. 158 (2003), no. 2, 419-471. MR 2018926 

• T. L. Payne: Closures of totally geodesic immersions into locally 
symmetric spaces of noncompact type, Proc. Amer. Math. Soc. 
127 (1999), no. 3, 829-833. MR 99f:53050 

• V. Vatsal: Uniform distribution of Heegner points, Invent. Math. 
148 (2002), no. 1, 1-46. MR 2003j:11070 

• R. J. Zimmer: Superrigidity, Ratner's theorem, and fundamental
groups, Israel J. Math. 74 (1991), no. 2-3, 199-207. MR 93b:22019

1.5. Polynomial divergence and shearing 

In this section, we illustrate some basic ideas that are used in Ratner's
proof that ergodic measures are homogeneous (1.3.7). This will be
done by giving direct proofs of some statements that follow easily from
her theorem. Our focus is on the group $\mathrm{SL}(2,\mathbb{R})$.

(1.5.1) Notation. Throughout this section, 

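• $\Gamma$ and $\Gamma'$ are lattices in $\mathrm{SL}(2,\mathbb{R})$,

• $u^t$ is the one-parameter unipotent subgroup of $\mathrm{SL}(2,\mathbb{R})$ defined
in (1.1.5),

• $\eta_t$ is the corresponding unipotent flow on $\Gamma\backslash\mathrm{SL}(2,\mathbb{R})$, and

• $\eta'_t$ is the corresponding unipotent flow on $\Gamma'\backslash\mathrm{SL}(2,\mathbb{R})$.
Furthermore, to provide an easy source of counterexamples,

• $a^t$ is the one-parameter diagonal subgroup of $\mathrm{SL}(2,\mathbb{R})$ defined in
(1.1.5),

• $\gamma_t$ is the corresponding geodesic flow on $\Gamma\backslash\mathrm{SL}(2,\mathbb{R})$, and

• $\gamma'_t$ is the corresponding geodesic flow on $\Gamma'\backslash\mathrm{SL}(2,\mathbb{R})$.
For convenience,

• we sometimes write $X$ for $\Gamma\backslash\mathrm{SL}(2,\mathbb{R})$, and

• we sometimes write $X'$ for $\Gamma'\backslash\mathrm{SL}(2,\mathbb{R})$.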
Let us begin by looking at one of Ratner's first major results in the
subject of unipotent flows.

(1.5.2) Example. Suppose $\Gamma$ is conjugate to $\Gamma'$. That is, suppose
there exists $g \in \mathrm{SL}(2,\mathbb{R})$, such that $\Gamma = g^{-1}\Gamma' g$.

Then $\eta_t$ is measurably isomorphic to $\eta'_t$. That is, there is a
(measure-preserving) bijection $\psi \colon \Gamma\backslash\mathrm{SL}(2,\mathbb{R}) \to \Gamma'\backslash\mathrm{SL}(2,\mathbb{R})$, such
that $\psi \circ \eta_t = \eta'_t \circ \psi$ (a.e.).

Namely, $\psi(\Gamma x) = \Gamma' gx$ (see Exer. 1). One may note that $\psi$ is
continuous (in fact, $C^\infty$), not just measurable.

The example shows that if $\Gamma$ is conjugate to $\Gamma'$, then $\eta_t$ is
measurably isomorphic to $\eta'_t$. (Furthermore, the isomorphism is obvious, not
some complicated measurable function.) Ratner proved the converse.
As we will see, this is now an easy consequence of Ratner's Measure
Classification Theorem (1.3.7), but it was once an important theorem
in its own right.

(1.5.3) Corollary (Ratner Rigidity Theorem). If $\eta_t$ is measurably
isomorphic to $\eta'_t$, then $\Gamma$ is conjugate to $\Gamma'$.

This means that if $\eta_t$ is measurably isomorphic to $\eta'_t$, then it is
obvious that the two flows are isomorphic, and an isomorphism can be
taken to be a nice, $C^\infty$ map. This is a very special property of
unipotent flows; in general, it is difficult to decide whether or not two flows
are measurably isomorphic, and measurable isomorphisms are usually
not $C^\infty$. For example, it can be shown that $\gamma_t$ is always measurably
isomorphic to $\gamma'_t$ (even if $\Gamma$ is not conjugate to $\Gamma'$), but there is usually
no $C^\infty$ isomorphism. (For the experts: this is because geodesic flows
are Bernoulli.)

(1.5.4) Remark. 

1) A version of Cor. 1.5.3 remains true with any Lie group $G$ in the
place of $\mathrm{SL}(2,\mathbb{R})$, and any (ergodic) unipotent flows.

2) In contrast, the conclusion fails miserably for some subgroups
that are not unipotent. For example, choose

• any $n, n' \ge 2$, and

• any lattices $\Gamma$ and $\Gamma'$ in $G = \mathrm{SL}(n,\mathbb{R})$ and $G' = \mathrm{SL}(n',\mathbb{R})$,
respectively.

By embedding $a^t$ in the top left corner of $G$ and $G'$, we obtain
(ergodic) flows $\varphi_t$ and $\varphi'_t$ on $\Gamma\backslash G$ and $\Gamma'\backslash G'$, respectively.

There is obviously no $C^\infty$ isomorphism between $\varphi_t$ and $\varphi'_t$,
because the homogeneous spaces $\Gamma\backslash G$ and $\Gamma'\backslash G'$ do not have the
same dimension (unless $n = n'$). Even so, it turns out that the
two flows are measurably isomorphic (up to a change in speed;
that is, after replacing $\varphi_t$ with $\varphi_{ct}$ for some $c \in \mathbb{R}^\times$). (For the
experts: this is because the flows are Bernoulli.)

Proof of Cor. 1.5.3. Suppose $\psi \colon (\eta_t, X) \to (\eta'_t, X')$ is a measurable
isomorphism. Consider the graph of $\psi$:

\[ \operatorname{graph}(\psi) = \{\, (x, \psi(x)) \mid x \in X \,\} \subset X \times X'. \]

Because $\psi$ is measure preserving and equivariant, we see that the
measure $\mu_G$ on $X$ pushes to an ergodic $\eta_t \times \eta'_t$-invariant measure $\mu_\times$ on
$X \times X'$ (see Exer. 3).

• Because $\eta_t \times \eta'_t$ is a unipotent flow (see Exer. 4), Ratner's Measure
Classification Theorem (1.3.7) applies, so we conclude that the
support of $\mu_\times$ is a single orbit of a subgroup $S$ of $\mathrm{SL}(2,\mathbb{R}) \times
\mathrm{SL}(2,\mathbb{R})$.

• On the other hand, $\operatorname{graph}(\psi)$ is the support of $\mu_\times$.

We conclude that the graph of $\psi$ is a single $S$-orbit (a.e.). This implies
that $\psi$ is equal to an affine map (a.e.); that is, $\psi$ is the composition of
a group homomorphism and a translation (see Exer. 6). So $\psi$ is of a
purely algebraic nature, not a terrible measurable map, and this implies
that $\Gamma$ is conjugate to $\Gamma'$ (see Exer. 7). □

We have seen that Cor. 1.5.3 is a consequence of Ratner's Theorem 
(1.3.7). It can also be proved directly, but the proof does not help to 
illustrate the ideas that are the main goal of this section, so we omit 
it. Instead, let us consider another consequence of Ratner's Theorem. 

(1.5.5) Definition. A flow $(\varphi_t, \Omega)$ is a quotient (or factor) of $(\eta_t, X)$
if there is a measure-preserving Borel function $\psi \colon X \to \Omega$, such that

\[ \psi \circ \eta_t = \varphi_t \circ \psi \quad \text{(a.e.)}. \tag{1.5.6} \]

For short, we may say $\psi$ is (essentially) equivariant if (1.5.6) holds.

The function $\psi$ is not assumed to be injective. (Indeed, quotients
are most interesting when $\psi$ collapses substantial portions of $X$ to single
points in $\Omega$.) On the other hand, $\psi$ must be essentially surjective (see
Exer. 8).

(1.5.7) Example. 

1) If $\Gamma \subset \Gamma'$, then the horocycle flow $(\eta'_t, X')$ is a quotient of $(\eta_t, X)$
(see Exer. 9).

2) For $v \in \mathbb{R}^n$ and $v' \in \mathbb{R}^{n'}$, let $\varphi_t$ and $\varphi'_t$ be the corresponding
flows on $\mathbb{T}^n$ and $\mathbb{T}^{n'}$. If

• $n' < n$, and

• $v'_i = v_i$ for $i = 1, \ldots, n'$,

then $(\varphi'_t, X')$ is a quotient of $(\varphi_t, X)$ (see Exer. 10).
3) The one-point space $\{*\}$ is a quotient of any flow. This is the
trivial quotient.

(1.5.8) Remark. Suppose $(\varphi_t, \Omega)$ is a quotient of $(\eta_t, X)$. Then there
is a map $\psi \colon X \to \Omega$ that is essentially equivariant. If we define

$x \sim y$ when $\psi(x) = \psi(y)$,

then $\sim$ is an equivalence relation on $X$, and we may identify $\Omega$ with
the quotient space $X/{\sim}$.

For simplicity, let us assume $\psi$ is completely equivariant (not just
a.e.). Then the equivalence relation $\sim$ is $\eta_t$-invariant; if $x \sim y$, then
$\eta_t(x) \sim \eta_t(y)$. Conversely, if $\equiv$ is an $\eta_t$-invariant (measurable)
equivalence relation on $X$, then $X/{\equiv}$ is a quotient of $(\eta_t, X)$.

Ratner proved, for $G = \mathrm{SL}(2,\mathbb{R})$, that unipotent flows are closed
under taking quotients:

(1.5.9) Corollary (Ratner Quotients Theorem). Each nontrivial quotient
of $(\eta_t, \Gamma\backslash\mathrm{SL}(2,\mathbb{R}))$ is isomorphic to a unipotent flow $(\eta'_t, \Gamma'\backslash\mathrm{SL}(2,\mathbb{R}))$,
for some lattice $\Gamma'$.

One can derive this from Ratner's Measure Classification Theorem
(1.3.7), by putting an $(\eta_t \times \eta_t)$-invariant probability measure on

$\{\, (x,y) \in X \times X \mid \psi(x) = \psi(y) \,\}$.

We omit the argument (it is similar to the proof of Cor. 1.5.3 (see
Exer. 11)), because it is very instructive to see a direct proof that
does not appeal to Ratner's Theorem. However, we will prove only the
following weaker statement. (The proof of (1.5.9) can then be completed
by applying Cor. 1.8.1 below (see Exer. 1.8#1).)

(1.5.10) Definition. A Borel function $\psi\colon X \to \Omega$ has finite fibers
(a.e.) if there is a conull subset $X_0$ of $X$, such that $\psi^{-1}(\omega) \cap X_0$ is
finite, for all $\omega \in \Omega$.

(1.5.11) Example. If $\Gamma \subset \Gamma'$, then the natural quotient map $\psi\colon X \to X'$
(cf. 1.5.7(1)) has finite fibers (see Exer. 13).

(1.5.12) Corollary (Ratner). If $\psi\colon (\eta_t, \Gamma\backslash\mathrm{SL}(2,\mathbb{R})) \to (\varphi_t, \Omega)$ is any
quotient map (and $\Omega$ is nontrivial), then $\psi$ has finite fibers (a.e.).

In preparation for the direct proof of this result, let us develop some 
basic properties of unipotent flows that are also used in the proofs of 
Ratner's general theorems. 

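Recall that

$u^t = \begin{bmatrix} 1 & 0 \\ t & 1 \end{bmatrix}$ and $a^t = \begin{bmatrix} e^t & 0 \\ 0 & e^{-t} \end{bmatrix}$.

For convenience, let $G = \mathrm{SL}(2,\mathbb{R})$.
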
Figure 1.5A. The $\eta_t$-orbits of two nearby points.



(1.5.13) Definition. If $x$ and $y$ are any two points of $\Gamma\backslash G$, then there
exists $q \in G$, such that $y = xq$. If $x$ is close to $y$ (which we denote
$x \approx y$), then $q$ may be chosen close to the identity. Thus, we may
define a metric $d$ on $\Gamma\backslash G$ by

$d(x,y) = \min\{\, \|q - I\| : q \in G,\ xq = y \,\}$,

where

• $I$ is the identity matrix, and

• $\|\cdot\|$ is any (fixed) matrix norm on $\mathrm{Mat}_{2\times 2}(\mathbb{R})$. For example, one
may take

$\left\| \begin{bmatrix} a & b \\ c & d \end{bmatrix} \right\| = \max\{|a|,|b|,|c|,|d|\}$.


A crucial part of Ratner's method involves looking at what happens
to two nearby points as they move under the flow $\eta_t$. Thus, we
consider two points $x$ and $xq$, with $q \approx I$, and we wish to calculate
$d(\eta_t(x), \eta_t(xq))$, or, in other words,

$d(xu^t, xqu^t)$

(see Fig. 1.5A).

• To get from $x$ to $xq$, one multiplies by $q$; therefore, $d(x, xq) = \|q - I\|$.

• To get from $xu^t$ to $xqu^t$, one multiplies by $u^{-t}qu^t$; therefore

$d(xu^t, xqu^t) = \|u^{-t}qu^t - I\|$

(as long as this is small: there are infinitely many elements $g$
of $G$ with $xu^t g = xqu^t$, and the distance is obtained by choosing
the smallest one, which may not be $u^{-t}qu^t$ if $t$ is large).

Letting

$q - I = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$,

a simple matrix calculation (see Exer. 14) shows that

$u^{-t}qu^t - I = \begin{bmatrix} a + bt & b \\ c - (a - d)t - bt^2 & d - bt \end{bmatrix}$. (1.5.14)
All the entries of this matrix are polynomials (in $t$), so we have the
following obvious conclusion:

(1.5.15) Proposition (Polynomial divergence). Nearby points of $\Gamma\backslash G$
move apart at polynomial speed.

In contrast, nearby points of the geodesic flow move apart at
exponential speed:

$a^{-t}qa^t - I = \begin{bmatrix} a & be^{-2t} \\ ce^{2t} & d \end{bmatrix}$ (1.5.16)

(see Exer. 15). Intuitively, one should think of polynomial speed as
"slow" and exponential speed as "fast." Thus,

• nearby points of a unipotent flow drift slowly apart, but 

• nearby points of the geodesic flow jump apart rather suddenly. 
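
The two conjugation formulas (1.5.14) and (1.5.16) can be sanity-checked numerically. The following is a minimal sketch, not part of the original argument: it multiplies out $u^{-t}qu^t$ and $a^{-t}qa^t$ directly and compares the results with the closed forms. All function names are hypothetical, and the sample entries $a, b, c, d$ are arbitrary small numbers (the identities hold for every $2 \times 2$ matrix $q$, so the determinant condition is not enforced).

```python
import math


def matmul(A, B):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def u(t):
    """The unipotent matrix u^t = [[1, 0], [t, 1]]."""
    return ((1.0, 0.0), (t, 1.0))


def a_diag(t):
    """The diagonal matrix a^t = diag(e^t, e^-t)."""
    return ((math.exp(t), 0.0), (0.0, math.exp(-t)))


def minus_identity(M):
    return ((M[0][0] - 1.0, M[0][1]), (M[1][0], M[1][1] - 1.0))


def unipotent_divergence(q, t):
    """u^{-t} q u^t - I, computed by direct multiplication."""
    return minus_identity(matmul(matmul(u(-t), q), u(t)))


def geodesic_divergence(q, t):
    """a^{-t} q a^t - I, computed by direct multiplication."""
    return minus_identity(matmul(matmul(a_diag(-t), q), a_diag(t)))


def formula_1_5_14(a, b, c, d, t):
    return ((a + b * t, b), (c - (a - d) * t - b * t * t, d - b * t))


def formula_1_5_16(a, b, c, d, t):
    return ((a, b * math.exp(-2 * t)), (c * math.exp(2 * t), d))


def close(M, N, tol=1e-9):
    return all(
        abs(M[i][j] - N[i][j]) <= tol * (1 + abs(N[i][j]))
        for i in range(2)
        for j in range(2)
    )


a, b, c, d = 1e-3, 2e-3, -1e-3, 3e-3
q = ((1 + a, b), (c, 1 + d))
print(close(unipotent_divergence(q, 5.0), formula_1_5_14(a, b, c, d, 5.0)))
print(close(geodesic_divergence(q, 2.0), formula_1_5_16(a, b, c, d, 2.0)))
```

In the unipotent case the entries grow like $t$ and $t^2$; in the geodesic case the bottom-left entry grows like $e^{2t}$.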
More precisely, note that 

1) if a polynomial (of bounded degree) stays small for a certain 
length of time, then it must remain fairly small for a proportional 
length of time (see Exer. 17): 

• if the polynomial is small for a minute, then it must stay 
fairly small for another second (say); 

• if the polynomial is small for an hour, then it must stay fairly 
small for another minute; 

• if the polynomial is small for a year, then it must stay fairly 
small for another week; 

• if the polynomial is small for several thousand years, then it 
must stay fairly small for at least a few more decades; 

• if the polynomial has been small for an infinitely long time, 
then it must stay small forever (in fact, it is constant). 

2) In contrast, the exponential function $e^t$ is fairly small ($< 1$)
infinitely far into the past (for $t < 0$), but it becomes arbitrarily
large in finite time.
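
The contrast between items 1 and 2 can be made quantitative. The sketch below relies on a classical fact that is not stated in the text: among real polynomials of degree at most $d$ bounded by 1 on an interval, a rescaled Chebyshev polynomial grows fastest outside it. So the worst-case value at distance $\epsilon\ell$ beyond an interval of length $\ell$ is $T_d(1 + 2\epsilon)$, independent of $\ell$. The function names are hypothetical.

```python
import math


def growth_bound(degree, eps):
    """Largest possible |p| at distance eps*L past an interval of length L,
    for a real polynomial p of the given degree with |p| <= 1 on the interval.
    This is the Chebyshev value T_degree(1 + 2*eps); it does not depend on L."""
    return math.cosh(degree * math.acosh(1.0 + 2.0 * eps))


def exponential_growth(length, eps):
    """Value of e^t at time eps*length, given that e^t <= 1 on [-length, 0]."""
    return math.exp(eps * length)


for length in (60, 3600):
    print(length, growth_bound(2, 0.01), exponential_growth(length, 0.01))
```

For degree 2 (the degree appearing in (1.5.14)) and $\epsilon = 0.01$, the polynomial bound is $2(1.02)^2 - 1 = 1.0808$ whatever the length of the interval, whereas the exponential's value over the same proportional extension grows with the length.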

Thus, 

1) If two points of a unipotent flow stay close together 90% of the 
time, then they must stay fairly close together all of the time. 

2) In contrast, two points of a geodesic flow may stay close together 
90% of the time, but spend the remaining 10% of their lives 
wandering quite freely (and independently) around the manifold. 

The upshot is that if we can get good bounds on a unipotent flow most 
of the time, then we have bounds that are nearly as good all of the 
time: 

(1.5.17) Notation. For convenience, let $x_t = xu^t$ and $y_t = yu^t$.

Figure 1.5B. Polynomial divergence: Two points that
stay close together for a period of time of length $\ell$ must
stay fairly close for an additional length of time $\epsilon\ell$ that
is proportional to $\ell$.

(1.5.18) Corollary. For any $\epsilon > 0$, there is a $\delta > 0$, such that if
$d(x_t, y_t) < \delta$ for 90% of the times $t$ in an interval $[a,b]$, then
$d(x_t, y_t) < \epsilon$ for all of the times $t$ in the interval $[a,b]$.

(1.5.19) Remark. Babysitting provides an analogy that illustrates this 
difference between unipotent flows and geodesic flows. 

1) A unipotent child is easy to watch over. If she sits quietly for an 
hour, then we may leave the room for a few minutes, knowing 
that she will not get into trouble while we are away. Before she 
leaves the room, she will start to make little motions, squirming 
in her chair. Eventually, as the motions grow, she may get out of 
the chair, but she will not go far for a while. It is only after giving 
many warning signs that she will start to walk slowly toward the 
door. 

2) A geodesic child, on the other hand, must be watched almost 
constantly. We can take our attention away for only a few seconds 
at a time, because, even if she has been sitting quietly in her chair 
all morning (or all week), the child might suddenly jump up and 
run out of the room while we are not looking, getting into all 
sorts of mischief. Then, before we notice she left, she might go 
back to her chair, and sit quietly again. We may have no idea 
there was anything amiss while we were not watching. 

Consider the RHS of Eq. 1.5.14, with $a$, $b$, $c$, and $d$ very small.
Indeed, let us say they are infinitesimal; too small to see. As $t$ grows,
it is the bottom left corner that will be the first matrix entry to
attain macroscopic size (see Exer. 18). Comparing with the definition
of $u^t$ (see 1.1.5), we see that this is exactly the direction of the $u^t$-orbit
(see Fig. 1.5C). Thus:

(1.5.20) Proposition (Shearing Property). The fastest relative motion 
between two nearby points is parallel to the orbits of the flow. 
Figure 1.5C. Shearing: If two points start out so close
together that we cannot tell them apart, then the first 
difference we see will be that one gets ahead of the 
other, but (apparently) following the same path. It 
is only much later that we will detect any difference 
between their paths. 

The only exception is that if $q \in \{u^t\}$, then $u^{-t}qu^t = q$ for all $t$; in
this case, the points $x_t$ and $y_t$ simply move along together at exactly
the same speed, with no relative motion.
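
The Shearing Property and its exception can be illustrated with the entries of (1.5.14). The sketch below (hypothetical function names, arbitrary sample values) scans increasing times $t$ until some entry of $u^{-t}qu^t - I$ reaches a chosen "macroscopic" threshold, and reports which entry got there first. It is only a numerical illustration, not a proof of Exer. 18.

```python
def conjugated_entries(a, b, c, d, t):
    """Entries of u^{-t} q u^t - I from Eq. (1.5.14), where q - I = [[a, b], [c, d]]."""
    return {
        "top-left": a + b * t,
        "top-right": b,
        "bottom-left": c - (a - d) * t - b * t * t,
        "bottom-right": d - b * t,
    }


def first_macroscopic_entry(a, b, c, d, threshold=1.0, t_max=1e9, growth=1.01):
    """Scan t = 1, 1.01, 1.01^2, ... and return (name, t, entries) for the first
    time some entry has absolute value >= threshold, or None if that never
    happens for t <= t_max."""
    t = 1.0
    while t <= t_max:
        entries = conjugated_entries(a, b, c, d, t)
        name, value = max(entries.items(), key=lambda kv: abs(kv[1]))
        if abs(value) >= threshold:
            return name, t, entries
        t *= growth
    return None


# Generic nearby point: the bottom-left entry (the orbit direction) wins.
name, t, entries = first_macroscopic_entry(1e-6, 2e-6, -1e-6, 3e-6)
print(name, round(t), entries["top-left"])

# Exceptional case q in {u^t}: a = b = d = 0, so nothing ever grows.
print(first_macroscopic_entry(0.0, 0.0, 1e-6, 0.0))
```

In the generic case the diagonal entries are still of size about $10^{-3}$ when the bottom-left entry reaches size 1, which is the shearing phenomenon in numerical form.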

(1.5.21) Corollary. If $x$ and $y$ are nearby points, then either

1) there exists $t > 0$, such that $y_t \approx x_{t \pm 1}$, or

2) $y = x_\epsilon$, for some $\epsilon \approx 0$.

(1.5.22) Remark (Infinitesimals). Many theorems and proofs in these 
notes are presented in terms of infinitesimals. (We write $x \approx y$ if the
distance from $x$ to $y$ is infinitesimal.) There are two main reasons for
this: 

1) Most importantly, these lectures are intended more to communi- 
cate ideas than to record rigorous proofs, and the terminology of 
infinitesimals is very good at that. It is helpful to begin by
pretending that points are infinitely close together, and see what will
happen. If desired, the reader may bring in epsilons and deltas 
after attaining an intuitive understanding of the situation. 

2) Nonstandard Analysis is a theory that provides a rigorous foun- 
dation to infinitesimals — almost all of the infinitesimal proofs 
that are sketched here can easily be made rigorous in these terms. 
For those who are comfortable with it, the infinitesimal approach 
is often simpler than the classical notation, but we will provide 
non-infinitesimal versions of the main results in Chap. 5. 

(1.5.23) Remark. In contrast to the above discussion of $u^t$,

• the matrix $a^t$ is diagonal, but

• the largest entry in the RHS of Eq. 1.5.16 is an off-diagonal entry,

so points in the geodesic flow move apart (at exponential speed) in a
direction transverse to the orbits (see Fig. 1.5D).

Let us now illustrate how to use the Shearing Property. 
Figure 1.5D. Exponential divergence: when two
points start out so close together that we cannot tell 
them apart, the first difference we see may be in a 
direction transverse to the orbits. 



Proof of Cor. 1.5.12. To bring the main ideas to the foreground, let 
us first consider a special case with some (rather drastic) simplifying 
assumptions. We will then explain that the assumptions are really not 
important to the argument. 

A1) Let us assume that $X$ is compact (rather than merely having
finite volume).

A2) Because $(\varphi_t, \Omega)$ is ergodic (see Exer. 16) and nontrivial, we know
that the set of fixed points has measure zero; let us assume that
$(\varphi_t, \Omega)$ has no fixed points at all. Therefore,

$d(\varphi_1(\omega), \omega)$ is bounded away from 0, as $\omega$ ranges over $\Omega$ (1.5.24)

(see Exer. 19).

A3) Let us assume that the quotient map $\psi$ is uniformly continuous
(rather than merely being measurable). This may seem unreasonable,
but Lusin's Theorem (Exer. 21) tells us that $\psi$ is uniformly
continuous on a set of measure $1 - \epsilon$, so, as we shall see, this is
actually not a major issue.

Suppose some fiber $\psi^{-1}(\omega_0)$ is infinite. (This will lead to a
contradiction.)

Because $X$ is compact, the infinite set $\psi^{-1}(\omega_0)$ must have an
accumulation point. Thus, there exist $x \approx y$ with $\psi(x) = \psi(y)$. Because
$\psi$ is equivariant, we have

$\psi(x_t) = \psi(y_t)$ for all $t$. (1.5.25)

Flow along the orbits until the points $x_t$ and $y_t$ have diverged to a
reasonable distance; say, $d(x_t, y_t) = 1$, and let

$\omega = \psi(y_t)$. (1.5.26)

Then the Shearing Property implies (see 1.5.21) that

$y_t \approx \eta_1(x_t)$. (1.5.27)
Therefore

\begin{align*}
\omega &= \psi(y_t) && \text{(1.5.26)} \\
&\approx \psi(\eta_1(x_t)) && \text{((1.5.27) and $\psi$ is uniformly continuous)} \\
&= \varphi_1(\psi(x_t)) && \text{($\psi$ is equivariant)} \\
&= \varphi_1(\omega) && \text{((1.5.25) and (1.5.26))}.
\end{align*}

This contradicts (1.5.24).

To complete the proof, we now indicate how to eliminate the
assumptions (A1), (A2), and (A3).

First, let us point out that (A1) was not necessary. The proof shows
that $\psi^{-1}(\omega)$ has no accumulation point (a.e.); thus, $\psi^{-1}(\omega)$ must be
countable. Measure theorists can show that a countable-to-one
equivariant map between ergodic spaces with invariant probability
measure must actually be finite-to-one (a.e.) (see Exer. 3.3#3). Second,
note that it suffices to show, for each $\epsilon > 0$, that there is a subset $\hat X$
of $X$, such that

• $\mu(\hat X) > 1 - \epsilon$ and

• $\psi^{-1}(\omega) \cap \hat X$ is countable, for a.e. $\omega \in \Omega$.

Now, let $\hat\Omega$ be the complement of the set of fixed points in $\Omega$. This
is conull, so $\psi^{-1}(\hat\Omega)$ is conull in $X$. Thus, by Lusin's Theorem, $\psi^{-1}(\hat\Omega)$
contains a compact set $K$, such that

• $\mu_G(K) > 0.99$, and

• $\psi$ is uniformly continuous on $K$.

Instead of making assumptions (A2) and (A3), we work inside of $K$.
Note that:

(A2') $d(\varphi_1(\omega), \omega)$ is bounded away from 0, for $\omega \in \psi(K)$; and

(A3') $\psi$ is uniformly continuous on $K$.

Let $\hat X$ be a generic set for $K$; that is, points in $\hat X$ spend 99% of their
lives in $K$. The Pointwise Ergodic Theorem (3.1.3) tells us that the
generic set is conull. (Technically, we need the points of $\hat X$ to be
uniformly generic: there is a constant $L$, independent of $x$, such that
every initial segment of the orbit of $x$ of length greater than $L$ spends
nearly all of its time in $K$, and this holds only on a set of measure
$1 - \epsilon$, but let us ignore this detail.) Given $x, y \in \hat X$, with $x \approx y$, flow
along the orbits until $d(x_t, y_t) = 1$. Unfortunately, it may not be the case
that $x_t$ and $y_t$ are in $K$, but, because 99% of each orbit is in $K$, we may
choose a nearby value $t'$ (say, $t \le t' \le 1.1t$), such that

$x_{t'} \in K$ and $y_{t'} \in K$.
By polynomial divergence, we know that the $y$-orbit drifts slowly
ahead of the $x$-orbit, so

$y_{t'} \approx \eta_{1+\delta}(x_{t'})$ for some small $\delta$.

Thus, combining the above argument with a strengthened version
of (A2') (see Exer. 22) shows that $\psi^{-1}(\omega) \cap \hat X$ has no accumulation
points (hence, is countable). This completes the proof. □

The following application of the Shearing Property is a better il- 
lustration of how it is actually used in the proof of Ratner's Theorem. 

(1.5.28) Definition. A self-joining of $(\eta_t, X)$ is a probability
measure $\hat\mu$ on $X \times X$, such that

1) $\hat\mu$ is invariant under the diagonal flow $\eta_t \times \eta_t$, and

2) $\hat\mu$ projects to $\mu_G$ on each factor of the product; that is,
$\hat\mu(A \times X) = \mu_G(A)$ and $\hat\mu(X \times B) = \mu_G(B)$.

(1.5.29) Example. 

1) The product measure $\hat\mu = \mu_G \times \mu_G$ is a self-joining.

2) There is a natural diagonal embedding $x \mapsto (x, x)$ of $X$ in $X \times X$.
This is clearly equivariant, so $\mu_G$ pushes to an $(\eta_t \times \eta_t)$-invariant
measure on $X \times X$. It is a self-joining, called the diagonal
self-joining.

3) Replacing the identity map $x \mapsto x$ with covering maps yields a
generalization of (2): For some $g \in G$, let $\Gamma' = \Gamma \cap (g^{-1}\Gamma g)$, and
assume $\Gamma'$ has finite index in $\Gamma$. There are two natural covering
maps from $X'$ to $X$:

• $\psi_1(\Gamma' x) = \Gamma x$, and

• $\psi_2(\Gamma' x) = \Gamma g x$

(see Exer. 23). Define $\psi\colon X' \to X \times X$ by

$\psi(x) = (\psi_1(x), \psi_2(x))$.

Then

• $\psi$ is equivariant (because $\psi_1$ and $\psi_2$ are equivariant), so the
$G$-invariant measure $\mu'_G$ on $X'$ pushes to an invariant measure
$\hat\mu = \psi_*\mu'_G$ on $X \times X$, defined by

$(\psi_*\mu'_G)(A) = \mu'_G(\psi^{-1}(A))$,

and

• $\hat\mu$ is a self-joining (because $\psi_1$ and $\psi_2$ are measure preserving).

This is called a finite-cover self-joining.

For unipotent flows on $\Gamma\backslash\mathrm{SL}(2,\mathbb{R})$, Ratner showed that these are
the only ergodic self-joinings.
Figure 1.5E. The diagonal self-joining and some
other finite-cover self-joinings.

(1.5.30) Corollary (Ratner's Joinings Theorem). Any ergodic
self-joining of a horocycle flow must be either

1) a finite cover, or

2) the product self-joining.

This follows quite easily from Ratner's Theorem (1.3.7) (see Exer. 24),
but we give a direct proof of the following weaker statement. (Note that
if the self-joining $\hat\mu$ is a finite cover, then $\hat\mu$ has finite fibers; that is, $\hat\mu$ is
supported on a set with only finitely many points from each horizontal
or vertical line (see Exer. 25).) Corollary 1.8.1 will complete the proof
of (1.5.30).

(1.5.31) Corollary. If $\hat\mu$ is an ergodic self-joining of $\eta_t$, then either

1) $\hat\mu$ is the product joining, or

2) $\hat\mu$ has finite fibers.

Proof. We omit some details (see Exer. 26 and Rem. 1.5.33).

Consider two points $(x, a)$ and $(x, b)$ in the same vertical fiber. If
the fiber is infinite (and $X$ is compact), then we may assume $a \approx b$. By
the Shearing Property (1.5.21), there is some $t$ with $a_t \approx \eta_1(b_t)$. Let $\xi_t$
be the vertical flow on $X \times X$, defined by

$\xi_t(x, y) = (x, \eta_t(y))$.

Then

$(x, a)_t = (x_t, a_t) \approx (x_t, \eta_1(b_t)) = \xi_1\bigl((x, b)_t\bigr)$.

We now consider two cases.

Case 1. Assume $\hat\mu$ is $\xi_1$-invariant. Then the ergodicity of $\eta_t$ implies
that $\hat\mu$ is the product joining (see Exer. 27).

Case 2. Assume $\hat\mu$ is not $\xi_1$-invariant. In other words, we have
$(\xi_1)_*(\hat\mu) \ne \hat\mu$. On the other hand, $(\xi_1)_*(\hat\mu)$ is $\eta_t$-invariant (because
$\xi_1$ commutes with $\eta_t$ (see Exer. 28)). It is a general fact that any two
ergodic measures for the same flow must be mutually singular (see
Exer. 30), so $(\xi_1)_*(\hat\mu) \perp \hat\mu$; thus, there is a conull subset $\hat X$ of $X \times X$,
such that $\hat\mu\bigl(\hat X \cap \xi_1(\hat X)\bigr) = 0$. From this, it is not difficult to see that there
is a compact subset $K$ of $X \times X$, such that

$\hat\mu(K) > 0.99$ and $d\bigl(K, \xi_1(K)\bigr) > 0$ (1.5.32)

(see Exer. 31).

To complete the proof, we show:

Claim. Any generic set for $K$ intersects each vertical fiber $\{x\} \times X$ in a
countable set.

Suppose not. (This will lead to a contradiction.) Because
the fiber is uncountable, there exist $(x, a)$ and $(x, b)$ in the generic set,
with $a \approx b$. Flow along the orbits until

$a_t \approx \eta_1(b_t)$,

and assume (as is true 98% of the time) that $(x, a)_t$ and $(x, b)_t$ belong
to $K$. Then

$K \ni (x, a)_t = (x_t, a_t) \approx (x_t, \eta_1(b_t)) = \xi_1\bigl((x, b)_t\bigr) \in \xi_1(K)$,

so $d\bigl(K, \xi_1(K)\bigr) = 0$. This contradicts (1.5.32). □

(1.5.33) Remark. The above proof ignores an important technical
point that also arose in the proof of Cor. 1.5.12: at the precise
time $t$ when $a_t \approx \eta_1(b_t)$, it may not be the case that $(x, a)_t$ and $(x, b)_t$
belong to $K$. We choose a nearby value $t'$ (say, $t \le t' \le 1.1t$), such that
$(x, a)_{t'}$ and $(x, b)_{t'}$ belong to $K$. By polynomial divergence, we know
that $a_{t'} \approx \eta_{1+\delta}(b_{t'})$ for some small $\delta$.

Hence, the final stage of the proof requires $\xi_{1+\delta}(K)$ to be disjoint
from $K$. Since $K$ must be chosen before we know the value of $\delta$, this
is a serious issue. Fortunately, a single set $K$ can be chosen that works
for all values of $\delta$ (cf. 5.8.6).
Exercises for §1.5.

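#1. Suppose $\Gamma$ and $\Gamma'$ are lattices in $G = \mathrm{SL}(2,\mathbb{R})$. Show that if
$\Gamma' = g^{-1}\Gamma g$, for some $g \in G$, then the map $\psi\colon \Gamma\backslash\mathrm{SL}(2,\mathbb{R}) \to
\Gamma'\backslash\mathrm{SL}(2,\mathbb{R})$, defined by $\psi(\Gamma x) = \Gamma' g^{-1} x$,

(a) is well defined, and

(b) is equivariant; that is, $\psi \circ \eta_t = \eta'_t \circ \psi$.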

#2. A nonempty, closed subset $C$ of $X \times X$ is minimal for $\eta_t \times \eta_t$
if the orbit of every point in $C$ is dense in $C$. Show that if $C$ is a
compact minimal set for $\eta_t \times \eta_t$, then $C$ has finite fibers.
[Hint: Use the proof of (1.5.31).]

#3. Suppose

• $(X, \mu)$ and $(X', \mu')$ are Borel measure spaces,

• $\varphi_t$ and $\varphi'_t$ are (measurable) flows on $X$ and $X'$, respectively,

• $\psi\colon X \to X'$ is a measure-preserving map, such that $\psi \circ \varphi_t =
\varphi'_t \circ \psi$ (a.e.), and

• $\mu_\times$ is the Borel measure on $X \times X'$ that is defined by

$\mu_\times(\Omega) = \mu\{\, x \in X \mid (x, \psi(x)) \in \Omega \,\}$.

Show:

(a) $\mu_\times$ is $(\varphi_t \times \varphi'_t)$-invariant.

(b) If $\mu$ is ergodic (for $\varphi_t$), then $\mu_\times$ is ergodic (for $\varphi_t \times \varphi'_t$).

#4. The product $\mathrm{SL}(2,\mathbb{R}) \times \mathrm{SL}(2,\mathbb{R})$ has a natural embedding in
$\mathrm{SL}(4,\mathbb{R})$ (as block diagonal matrices). Show that if $u$ and $v$ are
unipotent matrices in $\mathrm{SL}(2,\mathbb{R})$, then the image of $(u, v)$ is a
unipotent matrix in $\mathrm{SL}(4,\mathbb{R})$.

#5. Suppose $\psi$ is a function from a group $G$ to a group $H$. Show $\psi$
is a homomorphism if and only if the graph of $\psi$ is a subgroup of
$G \times H$.

#6. Suppose

• $G_1$ and $G_2$ are groups,

• $\Gamma_1$ and $\Gamma_2$ are subgroups of $G_1$ and $G_2$, respectively,

• $\psi$ is a function from $\Gamma_1\backslash G_1$ to $\Gamma_2\backslash G_2$,

• $S$ is a subgroup of $G_1 \times G_2$, and

• the graph of $\psi$ is equal to $xS$, for some $x \in \Gamma_1\backslash G_1 \times \Gamma_2\backslash G_2$.

Show:

(a) If $S \cap (e \times G_2)$ is trivial, then $\psi$ is an affine map.

(b) If $\psi$ is surjective, and $\Gamma_2$ does not contain any nontrivial
normal subgroup of $G_2$, then $S \cap (e \times G_2)$ is trivial.

[Hint: (6a) $S$ is the graph of a homomorphism from $G_1$ to $G_2$ (see
Exer. 5).]
#7. Suppose $\Gamma$ and $\Gamma'$ are lattices in a (simply connected) Lie group $G$.

(a) Show that if there is a bijective affine map from $\Gamma\backslash G$ to $\Gamma'\backslash G$,
then there is an automorphism $\alpha$ of $G$, such that $\alpha(\Gamma) = \Gamma'$.

(b) Show that if $\alpha$ is an automorphism of $\mathrm{SL}(2,\mathbb{R})$, such that
$\alpha(u)$ is conjugate to $u$, then $\alpha$ is an inner automorphism;
that is, there is some $g \in \mathrm{SL}(2,\mathbb{R})$, such that $\alpha(x) = g^{-1}xg$,
for all $x \in \mathrm{SL}(2,\mathbb{R})$.

(c) Show that if there is a bijective affine map $\psi\colon \Gamma\backslash G \to \Gamma'\backslash G$,
such that $\psi(xu) = \psi(x)u$, for all $x \in \Gamma\backslash G$, then $\Gamma$ is
conjugate to $\Gamma'$.

#8. Show that if $\psi\colon X \to \Omega$ is a measure-preserving map, then $\psi(X)$
is a conull subset of $\Omega$.
#9. Verify Eg. 1.5.7(1). 
#10. Verify Eg. 1.5.7(2). 

#11. Give a short proof of Cor. 1.5.9, by using Ratner's Measure Clas- 
sification Theorem. 

#12. Suppose $\Gamma$ is a lattice in a Lie group $G$. Show that a subgroup $\Gamma'$
of $\Gamma$ is a lattice if and only if $\Gamma'$ has finite index in $\Gamma$.

#13. Suppose $\Gamma$ and $\Gamma'$ are lattices in a Lie group $G$, such that $\Gamma \subset \Gamma'$.
Show that the natural map $\Gamma\backslash G \to \Gamma'\backslash G$ has finite fibers.

#14. Verify Eq. (1.5.14). 

#15. Verify Eq. (1.5.16). 

#16. Suppose $(\varphi'_t, \Omega')$ is a quotient of a flow $(\varphi_t, \Omega)$. Show that if $\varphi$ is
ergodic, then $\varphi'$ is ergodic.

#17. Given any natural number $d$, and any $\delta > 0$, show there is some
$\epsilon > 0$, such that if

• $f(x)$ is any real polynomial of degree $\le d$,

• $C \in \mathbb{R}^+$,

• $[k, k + \ell]$ is any real interval, and

• $|f(t)| < C$ for all $t \in [k, k + \ell]$,

then $|f(t)| < (1 + \delta)C$ for all $t \in [k, k + (1 + \epsilon)\ell]$.

#18. Given positive constants $\epsilon < L$, show there exists $\epsilon_0 > 0$, such
that if $|a|, |b|, |c|, |d| < \epsilon_0$, and $N > 0$, and we have

$|c - (a - d)t - bt^2| < L$ for all $t \in [0, N]$,

then $|a + bt| + |d - bt| < \epsilon$ for all $t \in [0, N]$.

#19. Suppose $\psi$ is a homeomorphism of a compact metric space $(X, d)$,
and that $\psi$ has no fixed points. Show there exists $\epsilon > 0$, such that,
for all $x \in X$, we have $d(\psi(x), x) > \epsilon$.

#20. (Probability measures are regular) Suppose 
• $X$ is a metric space that is separable and locally compact,

• $\mu$ is a Borel probability measure on $X$,

• $\epsilon > 0$, and

• $A$ is a measurable subset of $X$.

Show:

(a) there exist a compact set $C$ and an open set $V$, such that
$C \subset A \subset V$ and $\mu(V \setminus C) < \epsilon$, and

(b) there is a continuous function $f$ on $X$, such that

$\mu\{\, x \in X \mid \chi_A(x) \ne f(x) \,\} < \epsilon$,

where $\chi_A$ is the characteristic function of $A$.

[Hint: Recall that "separable" means $X$ has a countable, dense subset,
and that "locally compact" means every point of $X$ is contained in an
open set with compact closure. (20a) Show the collection $\mathcal{A}$ of sets $A$
such that $C$ and $V$ exist for every $\epsilon$ is a $\sigma$-algebra. (20b) Note that

$\dfrac{d(x, X \setminus V)}{d(x, X \setminus V) + d(x, C)}$

is a continuous function that is 1 on $C$ and 0 outside of $V$.]

#21. (Lusin's Theorem) Suppose 

• X is a metric space that is separable and locally compact, 

• $\mu$ is a Borel probability measure on $X$,

• $\epsilon > 0$, and

• $\psi\colon X \to \mathbb{R}$ is measurable.

Show there is a continuous function $f$ on $X$, such that

$\mu\{\, x \in X \mid \psi(x) \ne f(x) \,\} < \epsilon$.

[Hint: Construct step functions $\psi_n$ that converge uniformly to $\psi$ on
a set of measure $1 - (\epsilon/2)$. (Recall that a step function is a linear
combination of characteristic functions of sets.) Now $\psi_n$ is equal to
a continuous function $f_n$ on a set of measure $1 - 2^{-n}$ (cf. Exer. 20).
Then $\{f_n\}$ converges to $f$ uniformly on a set of measure $> 1 - \epsilon$.]

#22. Suppose

• $(X, d)$ is a metric space,

• $\mu$ is a probability measure on $X$,

• $\varphi_t$ is an ergodic, continuous, measure-preserving flow on $X$,
and

• $\epsilon > 0$.

Show that either

(a) some orbit of $\varphi_t$ has measure 1, or

(b) there exist $\delta > 0$ and a compact subset $K$ of $X$, such that

• $\mu(K) > 1 - \epsilon$ and

• $d(\varphi_t(x), x) > \delta$, for all $t \in (1 - \delta, 1 + \delta)$.

#23. Show that the maps $\psi_1$ and $\psi_2$ of Eg. 1.5.29(3) are well defined
and continuous.

#24. Derive Cor. 1.5.30 from Ratner's Measure Classification Theo- 
rem. 

#25. Show that if $\hat\mu$ is a finite-cover self-joining, then there is a $\hat\mu$-conull
subset $\Omega$ of $X \times X$, such that $(\{x\} \times X) \cap \Omega$ and $(X \times \{x\}) \cap \Omega$
are finite, for every $x \in X$.

#26. Write a rigorous (direct) proof of Cor. 1.5.31, by choosing
appropriate conull subsets of $X$, and so forth.

[Hint: You may assume (without proof) that there is a compact
subset $K$ of $X \times X$, such that $\hat\mu(K) > 0.99$ and $K \cap \xi_s(K) = \emptyset$ for all
$s \in \mathbb{R}$ with $(\xi_s)_*\hat\mu \ne \hat\mu$ (cf. 1.5.33).]

#27. Verify that $\hat\mu$ must be the product joining in Case 1 of the proof
of Cor. 1.5.31.

#28. Suppose

• $\varphi_t$ is a (measurable) flow on a measure space $X$,

• $\mu$ is a $\varphi_t$-invariant probability measure on $X$, and

• $\psi\colon X \to X$ is a Borel map that commutes with $\varphi_t$.

Show that $\psi_*\mu$ is $\varphi_t$-invariant.

#29. Suppose $\mu$ and $\nu$ are probability measures on a measure space $X$.
Show $\nu$ has a unique decomposition $\nu = \nu_1 + \nu_2$, where $\nu_1 \perp \mu$ and
$\nu_2 = f\mu$, for some $f \in L^1(\mu)$. (Recall that the notation $\mu_1 \perp \mu_2$
means the measures $\mu_1$ and $\mu_2$ are singular to each other; that
is, some $\mu_1$-conull set is $\mu_2$-null, and vice-versa.)
[Hint: The map $\phi \mapsto \int \phi \, d\mu$ is a linear functional on $L^2(X, \mu + \nu)$, so
it is represented by integration against a function $\psi \in L^2(X, \mu + \nu)$.
Let $\nu_1$ be the restriction of $\nu$ to $\psi^{-1}(0)$, and let $f = (1 - \psi)/\psi$.]

#30. Suppose

• $\varphi_t$ is a (measurable) flow on a space $X$, and

• $\mu_1$ and $\mu_2$ are two different ergodic, $\varphi_t$-invariant probability
measures on $X$.

Show that $\mu_1$ and $\mu_2$ are singular to each other.
[Hint: Exer. 29.]

#31. Suppose

• $X$ is a locally compact, separable metric space,

• $\mu$ is a probability measure on $X$, and

• $\psi\colon X \to X$ is a Borel map, such that $\psi_*\mu$ and $\mu$ are singular
to each other.

Show:

(a) There is a conull subset $\Omega$ of $X$, such that $\psi^{-1}(\Omega)$ is disjoint
from $\Omega$.

(b) For every $\epsilon > 0$, there is a compact subset $K$ of $X$, such that
$\mu(K) > 1 - \epsilon$ and $\psi^{-1}(K)$ is disjoint from $K$.

1.6. The Shearing Property for larger groups 

If $G$ is $\mathrm{SL}(3,\mathbb{R})$, or some other group larger than $\mathrm{SL}(2,\mathbb{R})$, then the
Shearing Property is usually not true as stated in (1.5.20) or (1.5.21).
This is because the centralizer of the subgroup $u^t$ is usually larger than
$\{u^t\}$.

(1.6.1) Example. If $y = xq$, with $q \in C_G(u^t)$, then $u^{-t}qu^t = q$ for
all $t$, so, contrary to (1.5.21), the points $x$ and $y$ move together, along
parallel orbits; there is no relative motion at all.

In a case where there is relative motion (that is, when $q \notin C_G(u^t)$),
the fastest relative motion will usually not be along the orbits of $u^t$,
but, rather, along some other direction in the centralizer of $u^t$. (We
saw an example of this in the proof that self-joinings have finite fibers
(see Cor. 1.5.31): under the unipotent flow $\eta_t \times \eta_t$, the points $(x, a)$ and
$(x, b)$ move apart in the direction of the flow $\xi_t$, not $\eta_t \times \eta_t$.)

(1.6.2) Proposition (Generalized Shearing Property). The fastest
relative motion between two nearby points is along some direction in the
centralizer of $u^t$.

More precisely, if

• $\{u^t\}$ is a unipotent one-parameter subgroup of $G$, and

• $x$ and $y$ are nearby points in $\Gamma\backslash G$,

then either

1) there exists $t > 0$ and $c \in C_G(u^t)$, such that

(a) $\|c\| = 1$, and

(b) $xu^t \approx yu^t c$,

or

2) there exists $c \in C_G(u^t)$, with $c \approx I$, such that $y = xc$.

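Proof (Requires some Lie theory). Write $y = xq$, with $q \approx I$. It is
easiest to work with exponential coordinates in the Lie algebra; for
$g \in G$ (with $g$ near $I$), let $\underline{g}$ be the (unique) small element of $\mathfrak{g}$ with
$\exp \underline{g} = g$. In particular, choose

• $\underline{u} \in \mathfrak{g}$ with $\exp(t\underline{u}) = u^t$, and

• $\underline{q} \in \mathfrak{g}$ with $\exp \underline{q} = q$.

Then

$\underline{u^{-t}qu^t} = \underline{q}(\mathrm{Ad}\, u^t) = \underline{q} \exp(\mathrm{ad}(t\underline{u}))$
$= \underline{q} + \underline{q}(\mathrm{ad}\,\underline{u})t + \tfrac{1}{2}\underline{q}(\mathrm{ad}\,\underline{u})^2 t^2 + \tfrac{1}{6}\underline{q}(\mathrm{ad}\,\underline{u})^3 t^3 + \cdots$.

For large $t$, the largest term is the one with the highest power of $t$; that
is, the last nonzero term $\underline{q}(\mathrm{ad}\,\underline{u})^k$. Then

$[\underline{q}(\mathrm{ad}\,\underline{u})^k, \underline{u}] = (\underline{q}(\mathrm{ad}\,\underline{u})^k)(\mathrm{ad}\,\underline{u}) = \underline{q}(\mathrm{ad}\,\underline{u})^{k+1} = 0$

(because the next term does not appear), so $\underline{q}(\mathrm{ad}\,\underline{u})^k$ is in the centralizer
of $u^t$. □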
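
The series in this proof can be worked out explicitly in the Lie algebra of $\mathrm{SL}(2,\mathbb{R})$. The sketch below is an illustration under stated assumptions: it takes $\underline{u}$ to be the nilpotent matrix with a single 1 in the bottom-left corner (so that $\exp(t\underline{u}) = u^t$), uses the right-action convention $x(\mathrm{ad}\,\underline{u}) = [x, \underline{u}] = x\underline{u} - \underline{u}x$ that matches the displayed bracket identity, and picks an arbitrary trace-zero integer matrix for $\underline{q}$. All function names are hypothetical.

```python
from fractions import Fraction
from math import factorial

ZERO = ((0, 0), (0, 0))
U_LOG = ((0, 0), (1, 0))  # nilpotent matrix whose exponential exp(t * U_LOG) is u^t


def matmul(A, B):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def right_ad(x, u):
    """x(ad u) = [x, u] = xu - ux."""
    xu, ux = matmul(x, u), matmul(u, x)
    return tuple(tuple(xu[i][j] - ux[i][j] for j in range(2)) for i in range(2))


def ad_chain(q, u, max_terms=10):
    """Return [q, q(ad u), q(ad u)^2, ...], stopping before the first zero term."""
    chain = [q]
    while len(chain) < max_terms:
        nxt = right_ad(chain[-1], u)
        if nxt == ZERO:
            break
        chain.append(nxt)
    return chain


def conjugate_via_series(q, u, t):
    """Sum of q(ad u)^k t^k / k! over the (finite) chain."""
    total = [[Fraction(0), Fraction(0)], [Fraction(0), Fraction(0)]]
    for k, term in enumerate(ad_chain(q, u)):
        coeff = Fraction(t) ** k / factorial(k)
        for i in range(2):
            for j in range(2):
                total[i][j] += coeff * term[i][j]
    return tuple(tuple(row) for row in total)


q_log = ((3, 5), (7, -3))
chain = ad_chain(q_log, U_LOG)
print(len(chain), chain[-1])
print(right_ad(chain[-1], U_LOG) == ZERO)
print(conjugate_via_series(q_log, U_LOG, 2))
```

Here the chain has three terms, the last one is a multiple of $\underline{u}$ itself (so it commutes with $\underline{u}$), and the summed series agrees with the pattern of entries in (1.5.14) with $d = -a$.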

The above proposition shows that the direction of fastest relative
motion is along the centralizer of $u^t$. This direction may or may not
belong to $\{u^t\}$ itself. In the proof of Ratner's Theorem, it turns out
that we wish to ignore motion along the orbits, and consider, instead,
only the component of the relative motion that is transverse (or
perpendicular) to the orbits of the flow. This direction, by definition, does
not belong to $\{u^t\}$. It may or may not belong to the centralizer of $\{u^t\}$.



(1.6.3) Example. Assume $G = \mathrm{SL}(2,\mathbb{R})$, and suppose $x$ and $y$ are two
points in $\Gamma\backslash G$ with $x \approx y$. Then, by continuity, $x_t \approx y_t$ for a long
time. Eventually, we will be able to see a difference between $x_t$ and $y_t$.
The Shearing Property (1.5.20) tells us that, when this first happens,
$x_t$ will be indistinguishable from some point on the orbit of $y$; that is,
$x_t \approx y_{t'}$ for some $t'$. This will continue for another long time (with $t'$
some function of $t$), but we can expect that $x_t$ will eventually diverge
from the orbit of $y$; this is transverse divergence. (Note that this
transverse divergence is a second-order effect; it is only apparent after
we mod out the relative motion along the orbit.) Letting $y_{t'}$ be the point
on the orbit of $y$ that is closest to $x_t$, we write $x_t = y_{t'} g$ for some $g \in G$.
Then $g - I$ represents the transverse divergence. When this transverse
divergence first becomes macroscopic, we wish to understand which of
the matrix entries of $g - I$ are macroscopic.

In the matrix on the RHS of Eq. (1.5.14), we have already
observed that the largest entry is in the bottom left corner, the direction
of $\{u^t\}$. If we ignore that entry, then the two diagonal entries are the
largest entries. The diagonal corresponds to the subgroup $\{a^t\}$ (or, in
geometric terms, to the geodesic flow $\gamma_t$). Thus, the fastest transverse
divergence is in the direction of $\{a^t\}$. Notice that $\{a^t\}$ normalizes $\{u^t\}$
(see Exer. 1.1#9).
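
The closing observation, that $\{a^t\}$ normalizes $\{u^t\}$, can be checked numerically. The sketch below assumes the matrix forms of $u^t$ and $a^t$ recalled in §1.5 and verifies the explicit relation $a^{-s} u^t a^s = u^{te^{2s}}$, which follows from a direct multiplication; the function names are hypothetical.

```python
import math


def matmul(A, B):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def u(t):
    """The unipotent matrix u^t = [[1, 0], [t, 1]]."""
    return ((1.0, 0.0), (t, 1.0))


def a_diag(s):
    """The diagonal matrix a^s = diag(e^s, e^-s)."""
    return ((math.exp(s), 0.0), (0.0, math.exp(-s)))


def conjugate_by_a(s, t):
    """Compute a^{-s} u^t a^s by direct multiplication."""
    return matmul(matmul(a_diag(-s), u(t)), a_diag(s))


def close(M, N, tol=1e-9):
    return all(
        abs(M[i][j] - N[i][j]) <= tol * (1 + abs(N[i][j]))
        for i in range(2)
        for j in range(2)
    )


s, t = 0.5, 2.0
print(close(conjugate_by_a(s, t), u(t * math.exp(2 * s))))
```

Conjugation by $a^s$ thus only rescales the time parameter of the horocycle flow, which is why motion in the $\{a^t\}$ direction maps $u^t$-orbits to $u^t$-orbits.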

(1.6.4) Proposition. The fastest transverse motion is along some
direction in the normalizer of $u^t$.

Proof. In the calculations of the proof of Prop. 1.6.2, any term that
belongs to $\mathfrak{u}$ (the Lie algebra of $\{u^t\}$) represents motion along $\{u^t\}$. Thus,
the fastest transverse motion is represented by the last term $\underline{q}(\mathrm{ad}\,\underline{u})^k$
that is not in $\mathfrak{u}$. Then $\underline{q}(\mathrm{ad}\,\underline{u})^{k+1} \in \mathfrak{u}$, or, in other words,

$[\underline{q}(\mathrm{ad}\,\underline{u})^k, \underline{u}] \in \mathfrak{u}$.

Therefore $\underline{q}(\mathrm{ad}\,\underline{u})^k$ normalizes $\mathfrak{u}$. □

By combining this observation with ideas from the proof that
joinings have finite fibers (see Cor. 1.5.31), we see that the fastest transverse
divergence is almost always in the direction of $\mathrm{Stab}_G(\mu)$, the subgroup
consisting of elements of $G$ that preserve $\mu$. More precisely:

(1.6.5) Corollary. There is a conull subset $X'$ of $X$, such that, for
all $x, y \in X'$, with $x \approx y$, the fastest transverse motion is along some
direction in $\mathrm{Stab}_G(\mu)$.

Proof. Because the fastest transverse motion is along the normalizer,
we know that

$yu^{t'} \approx xu^t c$,

for some $t, t' \in \mathbb{R}$ and $c \in N_G(u^t)$.

Suppose $c \notin \mathrm{Stab}_G(\mu)$. Then, as in the proof of (1.5.31), we may
assume $xu^t, yu^{t'} \in K$, where $K$ is a large compact set, such that
$K \cap Kc = \emptyset$. (Note that $t'$ is used, instead of $t$, in order to
eliminate relative motion along the $\{u^t\}$-orbit.) We have $d(K, Kc) > 0$, and
this contradicts the fact that $xu^t c \approx yu^{t'}$. □



(1.6.6) Remark. We note an important difference between the
preceding two results:

1) Proposition 1.6.4 is purely algebraic, and applies to all $x, y \in \Gamma\backslash G$
with $x \approx y$.

2) Corollary 1.6.5 depends on the measure $\mu$; it applies only on a
conull subset of $\Gamma\backslash G$.

We have considered only the case of a one-parameter subgroup $\{u^t\}$,
but, for the proof of Ratner's Theorem in general, it is important to
know that the analogue of Prop. 1.6.4 is also true for actions of larger
unipotent subgroups $U$:

the fastest transverse motion is along
some direction in the normalizer of $U$. (1.6.7)

To make a more precise statement, consider two points $x, y \in X$, with
$x \approx y$. When we apply larger and larger elements $u$ of $U$ to $x$, we will
eventually reach a point where we can see that $xu \notin yU$. When we first
reach this stage, there will be an element $c$ of the normalizer $N_G(U)$,
such that $xuc$ appears to be in $yU$; that is,

$xuc \approx yu'$, for some $u' \in U$. (1.6.8)

This implies that the analogue of Cor. 1.6.5 is also true:

(1.6.9) Corollary. There is a conull subset $X'$ of $X$, such that, for all
$x, y \in X'$, with $x \approx y$, the fastest transverse motion to the $U$-orbits is
along some direction in $\mathrm{Stab}_G(\mu)$.

To illustrate the importance of these results, let us prove the
following special case of Ratner's Measure Classification Theorem. It is a
major step forward. It shows, for example, that if $\mu$ is not supported
on a single $u^t$-orbit, then there must be other translations in $G$ that
preserve $\mu$.

(1.6.10) Proposition. Let

• $\Gamma$ be a lattice in a Lie group $G$,

• $u^t$ be a unipotent one-parameter subgroup of $G$, and

• $\mu$ be an ergodic $u^t$-invariant probability measure on $\Gamma \backslash G$.

If $U = \operatorname{Stab}_G(\mu)^\circ$ is unipotent, then $\mu$ is supported on a single $U$-orbit.

Proof. This is similar to the proof that joinings have finite fibers (see
Cor. 1.5.31). We ignore some details (these may be taken to be exercises
for the reader). For example, let us ignore the distinction between
$\operatorname{Stab}_G(\mu)$ and its identity component $\operatorname{Stab}_G(\mu)^\circ$ (see Exer. 1).

By ergodicity, it suffices to find a $U$-orbit of positive measure, so
let us suppose all $U$-orbits have measure 0. Actually, let us make the
stronger assumption that all $N_G(U)$-orbits have measure 0. This will
lead to a contradiction, so we can conclude that $\mu$ is supported on an
orbit of $N_G(U)$. It is easy to finish from there (see Exer. 3).

By our assumption of the preceding paragraph, for almost every
$x \in \Gamma \backslash G$, there exists $y \approx x$, such that

• $y \notin x N_G(U)$, and

• $y$ is in the support of $\mu$.

Because $y \notin x N_G(U)$, the $U$-orbit of $y$ has nontrivial transverse
divergence from the $U$-orbit of $x$ (see Exer. 4), so

$y u' \approx x u c,$

for some $u, u' \in U$ and $c \notin U$. From Cor. 1.6.9, we know that
$c \in \operatorname{Stab}_G(\mu)$. This contradicts the fact that $U = \operatorname{Stab}_G(\mu)$. □

Exercises for §1.6.

#1. The proof we gave of Prop. 1.6.10 assumes that $\operatorname{Stab}_G(\mu)$ is
unipotent. Correct the proof to use only the weaker assumption
that $\operatorname{Stab}_G(\mu)^\circ$ is unipotent.

#2. Suppose

• $\Gamma$ is a closed subgroup of a Lie group $G$,

• $U$ is a unipotent, normal subgroup of $G$, and

• $\mu$ is an ergodic $U$-invariant probability measure on $\Gamma \backslash G$.

Show that $\mu$ is supported on a single orbit of $\operatorname{Stab}_G(\mu)$.
[Hint: For each $g \in N_G(U)$, such that $g \notin \operatorname{Stab}_G(\mu)$, there is a conull
subset $\Omega$ of $\Gamma \backslash G$, such that $\Omega \cap g\Omega = \emptyset$ (see Exers. 1.5#30 and
1.5#31). You may assume, without proof, that this set can be chosen
independent of $g$: there is a conull subset $\Omega$ of $\Gamma \backslash G$, such that if
$g \in N_G(U)$ and $g \notin \operatorname{Stab}_G(\mu)$, then $\Omega \cap g\Omega = \emptyset$. (This will be proved
in (5.8.6).)]

#3. Suppose

• $\Gamma$ is a lattice in a Lie group $G$,

• $\mu$ is a $U$-invariant probability measure on $\Gamma \backslash G$, and

• $\mu$ is supported on a single $N_G(U)$-orbit.

Show that $\mu$ is supported on a single $U$-orbit.
[Hint: Reduce to the case where $N_G(U) = G$, and use Exer. 2.]

#4. In the situation of Prop. 1.6.10, show that if $x, y \in \Gamma \backslash G$, and
$y \notin x N_G(U)$, then the $U$-orbit of $y$ has nontrivial transverse
divergence from the $U$-orbit of $x$.

1.7. Entropy and a proof for $G = \operatorname{SL}(2,\mathbb{R})$

The Shearing Property (and consequences such as (1.6.10)) are an
important part of the proof of Ratner's Theorems, but there are two
additional ingredients. We discuss the role of entropy in this section.
The other ingredient, exploiting the direction of transverse divergence,
is the topic of the following section.

To illustrate, let us prove Ratner's Measure Classification Theorem
(1.3.7) for the case $G = \operatorname{SL}(2,\mathbb{R})$:

(1.7.1) Theorem. If

• $G = \operatorname{SL}(2,\mathbb{R})$,

• $\Gamma$ is any lattice in $G$, and

• $\eta_t$ is the usual unipotent flow on $\Gamma \backslash G$, corresponding to the
unipotent one-parameter subgroup
$u^t = \begin{bmatrix} 1 & 0 \\ t & 1 \end{bmatrix}$ (see 1.1.5),

then every ergodic $\eta_t$-invariant probability measure on $\Gamma \backslash G$ is
homogeneous.

Proof. Let

• $\mu$ be any ergodic $\eta_t$-invariant probability measure on $\Gamma \backslash G$, and

• $S = \operatorname{Stab}_G(\mu)$.

We wish to show that $\mu$ is supported on a single $S$-orbit.

Because $\mu$ is $\eta_t$-invariant, we know that $\{u^t\} \subset S$. We may
assume $\{u^t\} \neq S^\circ$. (Otherwise, it is obvious that $S^\circ$ is unipotent,
so Prop. 1.6.10 applies.) Therefore, $S^\circ$ contains the diagonal
one-parameter subgroup

$a^s = \begin{bmatrix} e^{s} & 0 \\ 0 & e^{-s} \end{bmatrix}$

(see Exer. 2). To complete the proof, we will show $S$ also contains the
opposite unipotent subgroup

$v^r = \begin{bmatrix} 1 & r \\ 0 & 1 \end{bmatrix}.$ (1.7.2)

Because $\{u^t\}$, $\{a^s\}$, and $\{v^r\}$, taken together, generate all of $G$, this
implies $S = G$, so $\mu$ must be the $G$-invariant (Haar) measure on $\Gamma \backslash G$,
which is obviously homogeneous.

Because $\{a^s\} \subset S$, we know that $a^s$ preserves $\mu$. Instead of
continuing to exploit dynamical properties of the unipotent subgroup $\{u^t\}$,
we complete the proof by working with $\{a^s\}$.

Let $\gamma_s$ be the flow corresponding to $a^s$ (see Notn. 1.1.5). The map
$\gamma_s$ is not an isometry:

• $\gamma_s$ multiplies infinitesimal distances in $u^t$-orbits by $e^{2s}$,

• $\gamma_s$ multiplies infinitesimal distances in $v^r$-orbits by $e^{-2s}$, and

• $\gamma_s$ does act as an isometry on $a^s$-orbits; it multiplies infinitesimal
distances along $a^s$-orbits by 1

(see Exer. 3). The map $\gamma_s$ is volume preserving because these factors
cancel exactly: $e^{2s} \cdot e^{-2s} \cdot 1 = 1$.
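
The three scaling factors can be checked numerically. The following is a minimal sketch, assuming the matrix forms $u^t = \begin{bmatrix} 1 & 0 \\ t & 1 \end{bmatrix}$, $a^s = \operatorname{diag}(e^{s}, e^{-s})$ and $v^r = \begin{bmatrix} 1 & r \\ 0 & 1 \end{bmatrix}$ and a right action, so that $(xg)a^s = (xa^s)(a^{-s} g a^s)$. The helper names (`matmul`, `conjugate_by_a`, `close`) are illustrative only and do not come from the text.

```python
import math

def matmul(m1, m2):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(m1[i][k] * m2[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def u(t):
    return ((1.0, 0.0), (t, 1.0))

def v(r):
    return ((1.0, r), (0.0, 1.0))

def a(s):
    return ((math.exp(s), 0.0), (0.0, math.exp(-s)))

def conjugate_by_a(g, s):
    """Return a^{-s} g a^{s}; this is how right translation by a^s
    moves a point x*g relative to the translated base point x*a^s."""
    return matmul(matmul(a(-s), g), a(s))

def close(m1, m2, tol=1e-9):
    """Entrywise comparison with a relative tolerance."""
    return all(
        abs(m1[i][j] - m2[i][j]) <= tol * max(1.0, abs(m2[i][j]))
        for i in range(2) for j in range(2)
    )

s, t = 0.7, 0.3
expands_u = close(conjugate_by_a(u(t), s), u(math.exp(2 * s) * t))
contracts_v = close(conjugate_by_a(v(t), s), v(math.exp(-2 * s) * t))
fixes_a = close(conjugate_by_a(a(t), s), a(t))
print(expands_u, contracts_v, fixes_a)
```

The three booleans correspond to the factors $e^{2s}$, $e^{-2s}$ and 1 in the list above; the particular values of `s` and `t` are arbitrary.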

The fact that $\gamma_s$ preserves the usual volume form on $\Gamma \backslash G$ led to
the equation $e^{2s} \cdot e^{-2s} \cdot 1 = 1$. Let us find the analogous conclusion
that results from the fact that $\gamma_s$ preserves the measure $\mu$:

• Because $\{a^s\}$ normalizes $\{u^t\}$ (see Exer. 1.1#9),

$B = \{ a^s u^t \mid s, t \in \mathbb{R} \}$

is a subgroup of $G$.

• Choose a small (2-dimensional) disk $D$ in some $B$-orbit.

• For some (fixed) small $\epsilon > 0$, and each $d \in D$, let
$\mathcal{B}_d = \{ d v^r \mid 0 \le r \le \epsilon \}$.

• Let $\mathcal{B} = \bigcup_{d \in D} \mathcal{B}_d$.

• Then $\mathcal{B}$ is the disjoint union of the fibers $\{\mathcal{B}_d\}_{d \in D}$, so the
restriction $\mu|_{\mathcal{B}}$ can be decomposed as an integral of probability
measures on the fibers:

$\mu|_{\mathcal{B}} = \int_D \mu_d \, d\nu(d),$

where $\nu$ is a measure on $D$ and each $\mu_d$ is a probability measure on
$\mathcal{B}_d$ (see 3.3.4).

• The map $\gamma_s$ multiplies areas in $D$ by $e^{2s} \cdot 1 = e^{2s}$.

• Then, because $\mu$ is $\gamma_s$-invariant, the contraction along the fibers
$\mathcal{B}_d$ must exactly cancel this: for $X \subset \mathcal{B}_d$, we have

$\mu_{\gamma_s(d)}\bigl(\gamma_s(X)\bigr) = e^{-2s} \mu_d(X).$

The conclusion is that the fiber measures $\mu_d$ scale exactly like the
Lebesgue measure on $[0, \epsilon]$. This implies, for example, that $\mu_d$
cannot be a point mass. In fact, one can use this conclusion to show that
$\mu_d$ must be precisely the Lebesgue measure. (From this, it follows
immediately that $\mu$ is the Haar measure on $\Gamma \backslash G$.) As will be explained
below, the concept of entropy provides a convenient means to formalize
the argument. □

(1.7.3) Notation. As will be explained in Chap. 2, one can define the
entropy of any measure-preserving transformation on any measure
space. (Roughly speaking, it is a number that describes how quickly
orbits of the transformation diverge from each other.) For any $g \in G$
and any $g$-invariant probability measure $\mu$ on $\Gamma \backslash G$, let $h_\mu(g)$ denote
the entropy of the translation by $g$.

A general lemma relates entropy to the rates at which the flow
expands the volume form on certain transverse foliations (see 2.5.11').
In the special case of $a^s$ in $\operatorname{SL}(2,\mathbb{R})$, it can be stated as follows.

(1.7.4) Lemma. Suppose $\mu$ is an $a^s$-invariant probability measure on
$\Gamma \backslash \operatorname{SL}(2,\mathbb{R})$.

We have $h_\mu(a^s) \le 2|s|$, with equality if and only if $\mu$ is
$\{u^t\}$-invariant.

We also have the following general fact (see Exer. 2.3#7):

(1.7.5) Lemma. The entropy of any invertible measure-preserving
transformation is equal to the entropy of its inverse.

Combining these two facts yields the following conclusion, which
completes the proof of Thm. 1.7.1.

(1.7.6) Corollary. Let $\mu$ be an ergodic $\{u^t\}$-invariant probability
measure on $\Gamma \backslash \operatorname{SL}(2,\mathbb{R})$.

If $\mu$ is $\{a^s\}$-invariant, then $\mu$ is $\operatorname{SL}(2,\mathbb{R})$-invariant.

Proof. From the equality clause of Lem. 1.7.4, we have $h_\mu(a^s) = 2|s|$,
so Lem. 1.7.5 asserts that $h_\mu(a^{-s}) = 2|s|$.

On the other hand, there is an automorphism of $\operatorname{SL}(2,\mathbb{R})$ that maps
$a^s$ to $a^{-s}$, and interchanges $\{u^t\}$ with $\{v^r\}$. Thus Lem. 1.7.4 implies:

$h_\mu(a^{-s}) \le 2|s|$,
with equality if and only if $\mu$ is $\{v^r\}$-invariant.

Combining this with the conclusion of the preceding paragraph, we
conclude that $\mu$ is $\{v^r\}$-invariant.

Because $v^r$, $a^s$, and $u^t$, taken together, generate the entire $\operatorname{SL}(2,\mathbb{R})$,
we conclude that $\mu$ is $\operatorname{SL}(2,\mathbb{R})$-invariant. □
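
The proof uses an automorphism that sends $a^s$ to $a^{-s}$ and interchanges the two unipotent subgroups, without naming one. One possible choice (an assumption made here for illustration, not necessarily the one the author has in mind) is the inverse-transpose map $g \mapsto (g^{T})^{-1}$. The sketch below checks it on the generators, assuming $u^t = \begin{bmatrix} 1 & 0 \\ t & 1 \end{bmatrix}$, $a^s = \operatorname{diag}(e^{s}, e^{-s})$ and $v^r = \begin{bmatrix} 1 & r \\ 0 & 1 \end{bmatrix}$; the function names are hypothetical.

```python
import math

def matmul(m1, m2):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(m1[i][k] * m2[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def u(t):
    return ((1.0, 0.0), (t, 1.0))

def v(r):
    return ((1.0, r), (0.0, 1.0))

def a(s):
    return ((math.exp(s), 0.0), (0.0, math.exp(-s)))

def inverse_transpose(g):
    """g -> (g^T)^{-1} for a 2x2 matrix of determinant 1.
    The inverse of [[p, q], [r, s]] is [[s, -q], [-r, p]];
    transposing that gives [[s, -r], [-q, p]]."""
    (p, q), (r, s) = g
    return ((s, -r), (-q, p))

def close(m1, m2, tol=1e-9):
    return all(
        abs(m1[i][j] - m2[i][j]) <= tol * max(1.0, abs(m2[i][j]))
        for i in range(2) for j in range(2)
    )

s, t = 0.4, 1.5
maps_a_to_inverse = close(inverse_transpose(a(s)), a(-s))
maps_u_into_v = close(inverse_transpose(u(t)), v(-t))
maps_v_into_u = close(inverse_transpose(v(t)), u(-t))

# The map respects products, as an automorphism must.
g, h = matmul(u(0.3), a(0.2)), matmul(v(-1.1), u(0.8))
is_multiplicative = close(
    inverse_transpose(matmul(g, h)),
    matmul(inverse_transpose(g), inverse_transpose(h)),
)
print(maps_a_to_inverse, maps_u_into_v, maps_v_into_u, is_multiplicative)
```

Note that $u^t$ is sent to $v^{-t}$ rather than $v^{t}$; only the subgroups $\{u^t\}$ and $\{v^r\}$ are interchanged, which is all the argument requires.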



Exercises for §1.7

#1. Let $T = \begin{bmatrix} 1 & a & c \\ 0 & 1 & b \\ 0 & 0 & 1 \end{bmatrix}$,
with $a, b \neq 0$. Show that if $V$ is a vector subspace of $\mathbb{R}^3$,
such that $T(V) \subset V$ and $\dim V > 1$, then $\{(0, *, 0)\} \subset V$.

#2. [Requires some Lie theory] Show that if $H$ is a connected subgroup
of $\operatorname{SL}(2,\mathbb{R})$ that contains $\{u^t\}$ as a proper subgroup, then
$\{a^s\} \subset H$.

[Hint: The Lie algebra of $H$ must be invariant under $\operatorname{Ad}_G u^t$. For the
appropriate basis of the Lie algebra $\mathfrak{sl}(2,\mathbb{R})$, the desired conclusion
follows from Exer. 1.]

#3. Show:

(a) $\gamma_s(x u^t) = \gamma_s(x) \, u^{e^{2s} t}$,

(b) $\gamma_s(x v^t) = \gamma_s(x) \, v^{e^{-2s} t}$, and

(c) $\gamma_s(x a^t) = \gamma_s(x) \, a^t$.

1.8. Direction of divergence and a joinings proof 

In §1.5, we proved only a weak form of the Joinings Theorem 
(1.5.30). To complete the proof of (1.5.30) and, more importantly, to 
illustrate another important ingredient of Ratner's proof, we provide a 
direct proof of the following fact: 

(1.8.1) Corollary (Ratner). If

• $\hat\mu$ is an ergodic self-joining of $\eta_t$, and

• $\hat\mu$ has finite fibers,

then $\hat\mu$ is a finite cover.

(1.8.2) Notation. We fix some notation for the duration of this section.
Let

• $\Gamma$ be a lattice in $G = \operatorname{SL}(2,\mathbb{R})$,

• $X = \Gamma \backslash G$,

• $U = \{u^t\}$,

• $A = \{a^s\}$,

• $V = \{v^r\}$,

• $\widetilde{\ } \colon G \to G \times G$ be the natural diagonal embedding
(so $\tilde g = (g, g)$).

At a certain point in the proof of Ratner's Measure Classification 
Theorem, we will know, for certain points x and y = xg, that the 
direction of fastest transverse divergence of the orbits belongs to a 
certain subgroup. This leads to a restriction on g. In the setting of 
Cor. 1.8.1, this crucial observation amounts to the following lemma. 

(1.8.3) Lemma. Let $x, y \in X \times X$. If

• $y \in x(V \times V)$, and

• the direction of fastest transverse divergence of the $\tilde U$-orbits of $x$
and $y$ belongs to $\tilde A$,

then $y \in x \tilde V$.

Proof. We have $y = xv$ for some $v \in V \times V$. Write $x = (x_1, x_2)$,
$y = (y_1, y_2)$ and $v = (v_1, v_2) = (v^{r_1}, v^{r_2})$. To determine the direction of
fastest transverse divergence, we calculate

$\tilde u^{-t} v \tilde u^{t} - (I, I) = \bigl( u^{-t} v_1 u^{t} - I, \; u^{-t} v_2 u^{t} - I \bigr)
= \left( \begin{bmatrix} r_1 t & r_1 \\ -r_1 t^2 & -r_1 t \end{bmatrix},
\begin{bmatrix} r_2 t & r_2 \\ -r_2 t^2 & -r_2 t \end{bmatrix} \right)$

(cf. 1.5.14). By assumption, the largest terms of the two components
must be (essentially) equal, so we conclude that $r_1 = r_2$. Therefore
$v \in \tilde V$, as desired. □
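
The matrix calculation in this proof can be verified with exact rational arithmetic. This is only a sketch under the conventions $u^t = \begin{bmatrix} 1 & 0 \\ t & 1 \end{bmatrix}$ and $v^r = \begin{bmatrix} 1 & r \\ 0 & 1 \end{bmatrix}$; the function names are hypothetical, and the sample values of $r$ and $t$ are arbitrary.

```python
from fractions import Fraction

def matmul(m1, m2):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(m1[i][k] * m2[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def u(t):
    return ((1, 0), (t, 1))

def v(r):
    return ((1, r), (0, 1))

def divergence(r, t):
    """Return u^{-t} v^r u^t - I, computed exactly."""
    m = matmul(matmul(u(-t), v(r)), u(t))
    return ((m[0][0] - 1, m[0][1]), (m[1][0], m[1][1] - 1))

def diagonal_terms_agree(r1, r2, t):
    """Compare the diagonal (A-direction) entries of the two components
    of the divergence of x and x*(v^{r1}, v^{r2}) at time t."""
    d1, d2 = divergence(r1, t), divergence(r2, t)
    return d1[0][0] == d2[0][0] and d1[1][1] == d2[1][1]

r, t = Fraction(1, 1000), Fraction(50)
formula_holds = divergence(r, t) == ((r * t, r), (-r * t * t, -r * t))
same_r = diagonal_terms_agree(r, r, t)
different_r = diagonal_terms_agree(r, 2 * r, t)
print(formula_holds, same_r, different_r)
```

For $t \neq 0$ the diagonal entries $r_1 t$ and $r_2 t$ coincide exactly when $r_1 = r_2$, which is the conclusion drawn in the proof.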

Also, as in the preceding section, the proof of Cor. 1.8.1 relies on 
the relation of entropy to the rates at which a flow expands the volume 
form on transverse foliations. For the case of interest to us here, the 
general lemma (2.5.11') can be stated as follows. 

(1.8.4) Lemma. Let

• $\hat\mu$ be an $\tilde a^s$-invariant probability measure on $X \times X$, and

• $\hat V$ be a connected subgroup of $V \times V$.

Then:

1) If $\hat\mu$ is $\hat V$-invariant, then $h_{\hat\mu}(\tilde a^s) \ge 2|s| \dim \hat V$.

2) If there is a conull, Borel subset $\Omega$ of $X \times X$, such that
$\Omega \cap x(V \times V) \subset x \hat V$, for every $x \in \Omega$, then
$h_{\hat\mu}(\tilde a^s) \le 2|s| \dim \hat V$.

3) If the hypotheses of (2) are satisfied, and equality holds in its
conclusion, then $\hat\mu$ is $\hat V$-invariant.

Proof of Cor. 1.8.1. We will show that $\hat\mu$ is $\tilde G$-invariant, and is
supported on a single $\tilde G$-orbit. (Actually, we will first replace $\tilde G$ by a
conjugate subgroup.) Then it is easy to see that $\hat\mu$ is a finite cover (see
Exer. 2).

It is obvious that $\hat\mu$ is not supported on a single $\tilde U$-orbit (because $\hat\mu$
must project to the Haar measure on each factor of $X \times X$), so, by
combining (1.6.7) with (1.6.9) (and Exer. 1.6#3), we see that
$\operatorname{Stab}_{G \times G}(\hat\mu)$ must contain a connected subgroup of $N_{G \times G}(\tilde U)$ that is
not contained in $\tilde U$. (Note that $N_{G \times G}(\tilde U) = \tilde A \ltimes (U \times U)$ (see
Exer. 3).) Using the fact that $\hat\mu$ has finite fibers, we conclude that
$\operatorname{Stab}_{G \times G}(\hat\mu)$ contains a conjugate of $\tilde A$ (see Exers. 4 and 5). Let us
assume, without loss of generality, that $\tilde A \subset \operatorname{Stab}_{G \times G}(\hat\mu)$ (see
Exer. 6); then

$N_{G \times G}(\tilde U) \cap \operatorname{Stab}_{G \times G}(\hat\mu) = \tilde A \tilde U$ (1.8.5)

(see Exer. 7). Combining (1.6.7), (1.6.9), and (1.8.5) yields a conull
subset $(X \times X)'$ of $X \times X$, such that if $x, y \in (X \times X)'$ (with
$x \approx y$), then the direction of fastest transverse divergence between the
$\tilde U$-orbits of $x$ and $y$ is an element of $\tilde A \tilde U$. Thus, Lem. 1.8.3 implies
that $(X \times X)' \cap x(V \times V) \subset x \tilde V$, so an entropy argument, based on
Lem. 1.8.4, shows that

$\hat\mu$ is $\tilde V$-invariant (1.8.6)

(see Exer. 8).

Because $\tilde U$, $\tilde A$, and $\tilde V$, taken together, generate $\tilde G$, we conclude that
$\hat\mu$ is $\tilde G$-invariant. Then, because $\hat\mu$ has finite fibers (and is ergodic), it
is easy to see that $\hat\mu$ is supported on a single $\tilde G$-orbit (see Exer. 9). □

Exercises for §1.8. 

#1. Obtain Cor. 1.5.9 by combining Cors. 1.5.12 and 1.8.1. 

#2. In the notation of (1.8.1) and (1.8.2), show that if $\hat\mu$ is
$\tilde G$-invariant, and is supported on a single $\tilde G$-orbit in $X \times X$, then
$\hat\mu$ is a finite-cover joining.

[Hint: The $\tilde G$-orbit supporting $\hat\mu$ can be identified with $\Gamma' \backslash G$, for some
lattice $\Gamma'$ in $G$.]

#3. In the notation of (1.8.2), show that $N_{G \times G}(\tilde U) = \tilde A \ltimes (U \times U)$.

#4. In the notation of (1.8.1), show that if $\hat\mu$ has finite fibers, then
$\operatorname{Stab}_{G \times G}(\hat\mu) \cap (G \times \{e\})$ is trivial.

#5. In the notation of (1.8.2), show that if $H$ is a connected subgroup
of $\tilde A \ltimes (U \times U)$, such that

• $H \not\subset U \times U$, and

• $H \cap (G \times \{e\})$ and $H \cap (\{e\} \times G)$ are trivial,

then $H$ contains a conjugate of $\tilde A$.

#6. Suppose

• $\Gamma$ is a lattice in a Lie group $G$,

• $\mu$ is a measure on $\Gamma \backslash G$, and

• $g \in G$.

Show $\operatorname{Stab}_G(g_* \mu) = g^{-1} \operatorname{Stab}_G(\mu) \, g$.

#7. In the notation of (1.8.2), show that if $H$ is a subgroup of
$(A \times A) \ltimes (U \times U)$, such that

• $\tilde A \tilde U \subset H$, and

• $H \cap (G \times \{e\})$ and $H \cap (\{e\} \times G)$ are trivial,

then $H = \tilde A \tilde U$.

#8. Establish (1.8.6).

#9. In the notation of (1.8.1) and (1.8.2), show that if $\hat\mu$ is
$\tilde G$-invariant, then $\hat\mu$ is supported on a single $\tilde G$-orbit.

[Hint: First show that $\hat\mu$ is supported on a finite union of $\tilde G$-orbits,
and then use the fact that $\hat\mu$ is ergodic.]

1.9. From measures to orbit closures 

In this section, we sketch the main ideas used to derive Ratner's 
Orbit Closure Theorem (1.1.14) from her Measure Classification Theo- 
rem (1.3.7). This is a generalization of (1.3.9), and is proved along the 
same lines. Instead of establishing only (1.1.14), the proof yields the 
much stronger Equidistribution Theorem (1.3.5). 

Proof of the Ratner Equidistribution Theorem. To simplify matters,
let us

A) assume that $\Gamma \backslash G$ is compact, and

B) ignore the fact that not all measures are ergodic.

Remarks 1.9.1 and 1.9.3 indicate how to modify the proof to eliminate
these assumptions.

Fix $x \in G$. By passing to a subgroup of $G$, we may assume

C) there does not exist any connected, closed, proper subgroup $S$
of $G$, such that

(a) $\{u^t\}_{t \in \mathbb{R}} \subset S$,

(b) the image $[xS]$ of $xS$ in $\Gamma \backslash G$ is closed, and has finite
$S$-invariant volume.

We wish to show that the $u^t$-orbit of $[x]$ is uniformly distributed in
all of $\Gamma \backslash G$, with respect to the $G$-invariant volume on $\Gamma \backslash G$. That is,
letting

• $x_t = x u^t$ and

• $\mu_L(f) = \frac{1}{L} \int_0^L f([x_t]) \, dt$,

we wish to show that the measures $\mu_L$ converge to $\operatorname{vol}_{\Gamma \backslash G}$, as $L \to \infty$.
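
To make the averages $\mu_L(f)$ concrete, here is a rough numerical sketch in a toy abelian case that is not taken from the text: $G = \mathbb{R}^2$ under addition, $\Gamma = \mathbb{Z}^2$, and the straight-line flow $x_t = x + t(1, \sqrt{2})$ on the torus $\Gamma \backslash G$. The test function, starting point, step size and function names are all assumptions chosen for illustration; the time integral is approximated by a midpoint Riemann sum.

```python
import math

def time_average(f, start, direction, L, step=0.01):
    """Midpoint Riemann-sum approximation of (1/L) * integral_0^L f(x_t) dt,
    where x_t = start + t * direction is read modulo 1 in each coordinate."""
    n = int(round(L / step))
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * step
        p = (start[0] + t * direction[0]) % 1.0
        q = (start[1] + t * direction[1]) % 1.0
        total += f(p, q)
    return total / n

def f(p, q):
    # Continuous function on the torus; its average over the torus is 1.
    return 1.0 + math.cos(2 * math.pi * p) * math.cos(2 * math.pi * q)

start = (0.1, 0.7)
direction = (1.0, math.sqrt(2))   # irrational slope
averages = {L: time_average(f, start, direction, L) for L in (10, 100, 1000)}
for L, value in averages.items():
    print(L, value)
```

In this toy case the averages approach the space average 1 as $L$ grows, which is the behaviour the Equidistribution Theorem asserts in far greater generality.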

Assume, for simplicity, that $\Gamma \backslash G$ is compact (see A). Then the
space of probability measures on $\Gamma \backslash G$ is compact (in an appropriate
weak* topology), so it suffices to show that

if $\mu_{L_n}$ is any convergent sequence, then the limit $\mu_\infty$ is $\operatorname{vol}_{\Gamma \backslash G}$.

It is easy to see that $\mu_\infty$ is $u^t$-invariant. Assume, for simplicity, that it
is also ergodic (see B). Then Ratner's Measure Classification Theorem
(1.3.7) implies that there is a connected, closed subgroup $S$ of $G$, and
some point $x'$ of $G$, such that

1) $\{u^t\}_{t \in \mathbb{R}} \subset S$,

2) the image $[x'S]$ of $x'S$ in $\Gamma \backslash G$ is closed, and has finite $S$-invariant
volume, and

3) $\mu_\infty = \operatorname{vol}_{[x'S]}$.

It suffices to show that $[x] \in [x'S]$, for then (C) implies that $S = G$, so

$\mu_\infty = \operatorname{vol}_{[x'S]} = \operatorname{vol}_{[x'G]} = \operatorname{vol}_{\Gamma \backslash G},$

as desired.

To simplify the remaining details, let us assume, for the moment,
that $S$ is trivial, so $\mu_\infty$ is the point mass at the point $[x']$. (Actually,
this is not possible, because $\{u^t\} \subset S$, but let us ignore this
inconsistency.) This means, for any neighborhood $\mathcal{O}$ of $[x']$, no matter how
small, that the orbit of $[x]$ spends more than 99% of its life in $\mathcal{O}$. By
Polynomial Divergence of Orbits (cf. 1.5.18), this implies that if we
enlarge $\mathcal{O}$ slightly, then the orbit is always in $\mathcal{O}$. Let $\widetilde{\mathcal{O}}$ be the
inverse image of $\mathcal{O}$ in $G$. Then, for some connected component
$\widetilde{\mathcal{O}}^\circ$ of $\widetilde{\mathcal{O}}$, we have $x u^t \in \widetilde{\mathcal{O}}^\circ$, for all $t$. But
$\widetilde{\mathcal{O}}^\circ$ is a small set (it has the same diameter as its image $\mathcal{O}$ in
$\Gamma \backslash G$), so this implies that $x u^t$ is a bounded function of $t$. A bounded
polynomial is constant, so we conclude that

$x u^t = x$ for all $t \in \mathbb{R}$.

Because $[x']$ is in the closure of the orbit of $[x]$, this implies that
$[x] = [x'] \in [x'S]$, as desired.

To complete the proof, we point out that a similar argument applies
even if $S$ is not trivial. We are ignoring some technicalities, but the idea
is simply that the orbit of $[x]$ must spend more than 99% of its life very
close to $[x'S]$. By Polynomial Divergence of Orbits, this implies that
the orbit spends all of its life fairly close to $[x'S]$. Because the distance
to $[x'S]$ is a polynomial function, we conclude that it is a constant, and
that this constant must be 0. So $[x] \in [x'S]$, as desired. □

The following two remarks indicate how to eliminate the assumptions
(A) and (B) from the proof of (1.3.5).

(1.9.1) Remark. If $\Gamma \backslash G$ is not compact, we consider its one-point
compactification $X = (\Gamma \backslash G) \cup \{\infty\}$. Then

• the set $\operatorname{Prob}(X)$ of probability measures on $X$ is compact, and

• $\operatorname{Prob}(\Gamma \backslash G) = \{ \mu \in \operatorname{Prob}(X) \mid \mu(\{\infty\}) = 0 \}$.

Thus, we need only show that the limit measure $\mu_\infty$ gives measure 0
to the point $\infty$. In spirit, this is a consequence of the Polynomial
Divergence of Orbits, much as in the above proof of (1.3.5), putting $\infty$ in
the role of $x'$. It takes considerable ingenuity to make the idea work,
but it is indeed possible. A formal statement of the result is given in
the following theorem.

(1.9.2) Theorem (Dani-Margulis). Suppose

• $\Gamma$ is a lattice in a Lie group $G$,

• $u^t$ is a unipotent one-parameter subgroup of $G$,

• $x \in \Gamma \backslash G$,

• $\epsilon > 0$, and

• $\lambda$ is the Lebesgue measure on $\mathbb{R}$.

Then there is a compact subset $K$ of $\Gamma \backslash G$, such that

$\limsup_{L \to \infty} \frac{\lambda \{ t \in [0, L] \mid x u^t \notin K \}}{L} < \epsilon.$

(1.9.3) Remark. Even if the limit measure $\mu_\infty$ is not ergodic, Ratner's
Measure Classification Theorem tells us that each of its ergodic
components is homogeneous. That is, for each ergodic component $\mu_z$,
there exist

• a point $x_z \in G$, and

• a closed, connected subgroup $S_z$ of $G$,

such that

1) $\{u^t\}_{t \in \mathbb{R}} \subset S_z$,

2) the image $[x_z S_z]$ of $x_z S_z$ in $\Gamma \backslash G$ is closed, and has finite
$S_z$-invariant volume, and

3) $\mu_z = \operatorname{vol}_{[x_z S_z]}$.

Arguments from algebra, based on the Borel Density Theorem, tell us
that:

a) up to conjugacy, there are only countably many possibilities for
the subgroups $S_z$ (see Exer. 4.7#7), and

b) for each subgroup $S_z$, the point $x_z$ must belong to a countable
collection of orbits of the normalizer $N_G(S_z)$.

The singular set $\mathcal{S}(u^t)$ corresponding to $u^t$ is the union of all of these
countably many $N_G(S_z)$-orbits for all of the possible subgroups $S_z$.
Thus:

1) $\mathcal{S}(u^t)$ is a countable union of lower-dimensional submanifolds of
$\Gamma \backslash G$, and

2) if $\mu'$ is any $u^t$-invariant probability measure on $\Gamma \backslash G$, such that
$\mu'(\mathcal{S}(u^t)) = 0$, then $\mu'$ is the Lebesgue measure.

So we simply wish to show that $\mu_\infty(\mathcal{S}(u^t)) = 0$.

This conclusion follows from the polynomial speed of unipotent
flows. Indeed, for every $\epsilon > 0$, because $x \notin \mathcal{S}(u^t)$, one can show there
is an open neighborhood $\mathcal{O}$ of $\mathcal{S}(u^t)$, such that

$\frac{\lambda ( \{ t \in [0, L] \mid x u^t \in \mathcal{O} \} )}{L} < \epsilon$ for every $L > 0$, (1.9.4)

where $\lambda$ is the Lebesgue measure on $\mathbb{R}$.

For many applications, it is useful to have the following stronger
("uniform") version of the Equidistribution Theorem (1.3.4):

(1.9.5) Theorem. Suppose

• $\Gamma$ is a lattice in a connected Lie group $G$,

• $\mu$ is the $G$-invariant probability measure on $\Gamma \backslash G$,

• $\{u_n^t\}$ is a sequence of one-parameter unipotent subgroups of $G$,
converging to a one-parameter subgroup $u^t$ (that is, $u_n^t \to u^t$ for
all $t$),

• $\{x_n\}$ is a convergent sequence of points in $\Gamma \backslash G$, such that
$\lim_{n \to \infty} x_n \notin \mathcal{S}(u^t)$,

• $\{L_n\}$ is a sequence of real numbers tending to $\infty$, and

• $f$ is any bounded, continuous function on $\Gamma \backslash G$.

Then

$\lim_{n \to \infty} \frac{1}{L_n} \int_0^{L_n} f(x_n u_n^t) \, dt = \int_{\Gamma \backslash G} f \, d\mu.$

Exercises for §1.9. 

#1. Reversing the logical order, prove that Thm. 1.9.2 is a corollary
of the Equidistribution Theorem (1.3.4).

#2. Suppose $S$ is a subgroup of $G$, and $H$ is a subgroup of $S$. Show,
for all $g \in N_G(S)$ and all $h \in H$, that $Sgh = Sg$.

Brief history of Ratner's Theorems 

In the 1930's, G. Hedlund [21, 22, 23] proved that if $G = \operatorname{SL}(2,\mathbb{R})$
and $\Gamma \backslash G$ is compact, then unipotent flows on $\Gamma \backslash G$ are ergodic and
minimal.

It was not until 1970 that H. Furstenberg [19] proved these flows are
uniquely ergodic, thus establishing the Measure Classification Theorem
for this case. At about the same time, W. Parry [37, 38] proved an Orbit
Closure Theorem, Measure Classification Theorem, and Equidistribution
Theorem for the case where $G$ is nilpotent, and G. A. Margulis [28]
used the polynomial speed of unipotent flows to prove the important
fact that unipotent orbits cannot go off to infinity.

Inspired by these and other early results, M. S. Raghunathan
conjectured a version of the Orbit Closure Theorem, and showed that it
would imply the Oppenheim Conjecture. Apparently, he did not publish
this conjecture, but it appeared in a paper of S. G. Dani [7] in 1981.
In this paper, Dani conjectured a version of the Measure Classification
Theorem.

Dani [6] also generalized Furstenberg's Theorem to the case where
$\Gamma \backslash \operatorname{SL}(2,\mathbb{R})$ is not compact. Publications of R. Bowen [4], S. G. Dani
[6, 7], R. Ellis and W. Perrizo [17], and W. Veech [60] proved further
generalizations for the case where the unipotent subgroup $U$ is
horospherical (see 2.5.6 for the definition). (Results for horosphericals also
follow from a method in the thesis of G. A. Margulis [27, Lem. 5.2] (cf.
Exer. 5.7#5).)

M. Ratner began her work on the subject at about this time, proving
her Rigidity Theorem (1.5.3), Quotients Theorem (1.5.9), Joinings
Theorem (1.5.30), and other fundamental results in the early 1980's
[41, 42, 43]. (See [44] for an overview of this early work.) Using
Ratner's methods, D. Witte [61, 62] generalized her rigidity theorem to
all $G$.

S. G. Dani and J. Smillie [16] proved the Equidistribution Theo- 
rem when G = SL(2,R). S. G. Dani [8] showed that unipotent orbits 
spend only a negligible fraction of their life near infinity. A. Starkov 
[57] proved an orbit closure theorem for the case where G is solvable. 

Using unipotent flows, G. A. Margulis [29] proved the Oppenheim
Conjecture (1.2.2) on values of quadratic forms. He and S. G. Dani
[12, 13, 14] then proved a number of results, including the first example
of an orbit closure theorem for actions of non-horospherical unipotent
subgroups of a semisimple Lie group, namely, for "generic"
one-parameter unipotent subgroups of $\operatorname{SL}(3,\mathbb{R})$. (G. A. Margulis [32, §3.8,
top of p. 319] has pointed out that the methods could yield a proof of
the general case of the Orbit Closure Theorem.)

Then M. Ratner [45, 46, 47, 48] proved her amazing theorems 
(largely independently of the work of others), by expanding the ideas 
from her earlier study of horocycle flows. (In the meantime, N. Shah [55] 
showed that the Measure Classification Theorem implied an
Equidistribution Theorem for many cases when $G = \operatorname{SL}(3,\mathbb{R})$.)

Ratner's Theorems were soon generalized to p-adic groups, by
M. Ratner [51] and, independently, by G. A. Margulis and G. Tomanov
[33, 34]. N. Shah [56] generalized the results to subgroups generated by
unipotent elements (1.1.19). (For connected subgroups generated by
unipotent elements, this was proved in Ratner's original papers.)

Notes 

§1.1. See [2] for an excellent introduction to the general area of 
flows on homogeneous spaces. Surveys at an advanced level are given 
in [9, 11, 24, 31, 58]. Discussions of Ratner's Theorems can be found in 
[2, 9, 20, 50, 52, 58]. 

Raghunathan's book [40] is the standard reference for basic prop- 
erties of lattice subgroups. 

The dynamical behavior of the geodesic flow can be studied by
associating a continued fraction to each point of $\Gamma \backslash G$. (See [1] for an
elementary explanation.) In this representation, the fact that some
orbit closures are fractal sets (1.1.9) is an easy observation.

See [40, Thms. 1.12 and 1.13, pp. 22-23] for solutions of Ex- 
ers. 1.1#23 and 1.1#24. 

§1.2. Margulis' Theorem on values of quadratic forms (1.2.2) was
proved in [29], by using unipotent flows. For a discussion and history
of this theorem, and its previous life as the Oppenheim Conjecture, see
[32]. An elementary proof is given in [14], [2, Chap. 6], and [10].

§1.3. M. Ratner proved her Measure Classification Theorem (1.3.7) 
in [45, 46, 47]. She then derived her Equidistribution Theorem (1.3.4) 
and her Orbit Closure Theorem (1.1.14) in [48]. A derivation also ap- 
pears in [15], and an outline can be found in [33, §11]. 

In her original proof of the Measure Classification Theorem, Ratner
only assumed that $\Gamma$ is discrete, not that it is a lattice. D. Witte [63,
§3] observed that discreteness is also not necessary (Rem. 1.3.10(1)).

See [39, §12] for a discussion of Choquet's Theorem, including a
solution of Exer. 1.3#6.

§1.4. The quantitative version (1.4.1) of Margulis' Theorem on 
values of quadratic forms is due to A. Eskin, G. A. Margulis, and 
S. Mozes [18]. See [32] for more discussion of the proof, and the partial 
results that preceded it. 

See [26] for a discussion of Quantum Unique Ergodicity and related
results. Conjecture 1.4.6 (in a more general form) is due to Z. Rudnick
and P. Sarnak. Theorem 1.4.8 was proved by E. Lindenstrauss [26].
The crucial fact that $h_{\hat\mu}(a^t) \neq 0$ was proved by J. Bourgain and
E. Lindenstrauss [3].

The results of §1.4C are due to H. Oh [35].

§1.5. The Ratner Rigidity Theorem (1.5.3) was proved in [41].
Remark 1.5.4(1) was proved in [61, 62].

Flows by diagonal subgroups were proved to be Bernoulli (see 1.5.4(2)) 
by S. G. Dani [5], using methods of D. Ornstein and B. Weiss [36]. 

The Ratner Quotients Theorem (1.5.9) was proved in [42], together 
with Cor. 1.5.12. 

The crucial property (1.5.18) of polynomial divergence was introduced
by M. Ratner [41, §2] for unipotent flows on homogeneous spaces
of $\operatorname{SL}(2,\mathbb{R})$. Similar ideas had previously been used by Margulis in [28]
for more general unipotent flows.

The Shearing Property (1.5.20 and 1.5.21) was introduced by 
M. Ratner [42, Lem. 2.1] in the proof of her Quotients Theorem (1.5.9), 
and was also a crucial ingredient in the proof [43] of her Joinings The- 
orem (1.5.30). She [43, Defn. 1] named a certain precise version of this 
the "H-property," in honor of the horocycle flow. 

An introduction to Nonstandard Analysis (the rigorous theory of 
infinitesimals) can be found in [53] or [59]. 

Lusin's Theorem (Exer. 1.5#21) and the decomposition of a mea- 
sure into a singular part and an absolutely continuous part (see 
Exer. 1.5#29) appear in many graduate analysis texts, such as [54, 
Thms. 2.23 and 6.9]. 

§1.6. The generalization (1.6.2) of the Shearing Property to other 
Lie groups appears in [61, §6], and was called the "Ratner property." 
The important extension (1.6.7) to transverse divergence of actions of 
higher-dimensional unipotent subgroups is implicit in the "R-property" 
introduced by M. Ratner [45, Thm. 3.1]. In fact, the R-property com- 
bines (1.6.7) with polynomial divergence. It played an essential role in 
Ratner's proof of the Measure Classification Theorem. 

The arguments used in the proofs of (1.6.5) and (1.6.10) appear in 
[49, Thm. 4.1]. 

§1.7. Theorem 1.7.1 was proved by S. G. Dani [6], using meth- 
ods of H. Furstenberg [19]. Elementary proofs based on Ratner's ideas 
(without using entropy) can be found in [49], [2, §4.3], [20], and [58, 
§16]. 

The entropy estimates (1.7.4) and (1.8.4) are special cases of a 
result of G. A. Margulis and G. Tomanov [33, Thm. 9.7]. (Margulis 
and Tomanov were influenced by a theorem of F. Ledrappier and L.- 
S. Young [25].) We discuss the Margulis- Tomanov result in §2.5, and 
give a sketch of the proof in §2.6. 

§1.8. The subgroup $V$ will be called $S_-$ in §5.4. The proof of
Lem. 1.8.3 essentially amounts to a verification of Eq. 5.4.3(5).

The key point (1.8.5) in the proof of Cor. 1.8.1 is a special case of
Prop. 5.6.1.

§1.9. G. A. Margulis [28] proved a weak version of Thm. 1.9.2 in
1971. Namely, for each $x \in \Gamma \backslash G$, he showed there is a compact subset $K$
of $\Gamma \backslash G$, such that

$\{ t \in [0, \infty) \mid [x u^t] \in K \}$ (1.10.6)

is unbounded. The argument is elementary, but ingenious. A very nice
version of the proof appears in [14, Appendix] (and [2, §V.5]).

Fifteen years later, S. G. Dani [8] refined Margulis' proof, and
obtained (1.9.2), by showing that the set (1.10.6) not only is unbounded,
but has density $> 1 - \epsilon$. The special case of Thm. 1.9.2 in which
$G = \operatorname{SL}(2,\mathbb{R})$ and $\Gamma = \operatorname{SL}(2,\mathbb{Z})$ can be proved easily (see [49, Thm. 3.1]
or [58, Thm. 12.2, p. 96]).

The uniform version (1.9.5) of the Equidistribution Theorem was
proved by S. G. Dani and G. A. Margulis [15]. The crucial inequality
(1.9.4) is obtained from the Dani-Margulis linearization method
introduced in [15, §3].

CHAPTER 2



Introduction to Entropy 

The entropy of a dynamical system can be intuitively described 
as a number that expresses the amount of "unpredictability" in the 
system. Before beginning the formal development, let us illustrate the 
idea by contrasting two examples. 

2.1. Two dynamical systems 

(2.1.1) Definition. In classical ergodic theory, a dynamical system
(with discrete time) is an action of $\mathbb{Z}$ on a measure space, or, in other
words, a measurable transformation $T \colon \Omega \to \Omega$ on a measure space $\Omega$.
(The intuition is that the points of $\Omega$ are in motion. A particle that
is at a certain point $\omega \in \Omega$ will move to a point $T(\omega)$ after a unit of
time.) We assume:

1) $T$ has a (measurable) inverse $T^{-1} \colon \Omega \to \Omega$, and

2) there is a $T$-invariant probability measure $\mu$ on $\Omega$.

The assumption that $\mu$ is $T$-invariant means $\mu(T(A)) = \mu(A)$, for
every measurable subset $A$ of $\Omega$. (This generalizes the notion of an
incompressible fluid in fluid dynamics.)

(2.1.2) Example (Irrational rotation of the circle). Let $\mathbb{T} = \mathbb{R}/\mathbb{Z}$ be
the circle group; for any $\beta \in \mathbb{R}$, we have a measurable transformation
$T_\beta \colon \mathbb{T} \to \mathbb{T}$ defined by

$T_\beta(t) = t + \beta.$

(The usual arc-length Lebesgue measure is $T_\beta$-invariant.) In physical
terms, we have a circular hoop of circumference 1 that is rotating at a
speed of $\beta$ (see Fig. 2.1A). Note that we are taking the circumference,
not the radius, to be 1.

Figure 2.1A. $T_\beta$ moves each point on the circle a
distance of $\beta$. In other words, $T_\beta$ rotates the circle
$360\beta$ degrees.

If [3 is irrational, it is well known that every orbit of this dynamical 
system is uniformly distributed in T (see Exer. 2) (so the dynamical 
system is uniquely ergodic). 
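The uniform distribution of orbits can be observed numerically. The following is only a sketch in floating-point arithmetic, not a proof; the function name `rotation_orbit_fraction`, the starting point, the arc, and the choice $\beta = \sqrt{2} - 1$ are all illustrative assumptions that do not come from the text.

```python
import math

def rotation_orbit_fraction(beta, x, arc_start, arc_length, n_steps):
    """Fraction of k in {1, ..., n_steps} for which T_beta^k(x) lies in the arc
    [arc_start, arc_start + arc_length) of the circle R/Z."""
    hits = 0
    for k in range(1, n_steps + 1):
        point = (x + k * beta) % 1.0            # T_beta^k(x) = x + k*beta (mod 1)
        if (point - arc_start) % 1.0 < arc_length:
            hits += 1
    return hits / n_steps

beta = math.sqrt(2) - 1                          # an irrational rotation number (assumed)
fraction = rotation_orbit_fraction(beta, x=0.3, arc_start=0.2,
                                   arc_length=0.25, n_steps=100_000)
print(fraction)                                  # should be close to the arc length 0.25
```

For a rational rotation number the same function typically returns a value that does not match the arc length, because the orbit is finite.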

(2.1.3) Example (Bernoulli shift). Our other basic example comes 
from the study of coin tossing. Assuming we have a fair coin, this is 
modeled by a two event universe $\mathcal{C} = \{H, T\}$, in which each event has 
probability 1/2. The probability space for tossing two coins (indepen- 
dently) is the product space $\mathcal{C} \times \mathcal{C}$, with the product measure (each of 
the four possible events has probability $(1/2)^2 = 1/4$). For $n$ tosses, 
we take the product measure on $\mathcal{C}^n$ (each of the $2^n$ possible events has 
probability $(1/2)^n$). Now consider tossing a coin once each day for all 
eternity (this is a doubly infinite sequence of coin tosses). The proba- 
bility space is an infinite cartesian product 

\[ \mathcal{C}^\infty = \{\, f\colon \mathbb{Z} \to \mathcal{C} \,\} \]

with the product measure: for any two disjoint finite subsets $\mathcal{H}$ and $\mathcal{T}$ 
of $\mathbb{Z}$, the probability that 

• $f(n) = H$ for all $n \in \mathcal{H}$, and 

• $f(n) = T$ for all $n \in \mathcal{T}$ 

is exactly $(1/2)^{|\mathcal{H}| + |\mathcal{T}|}$. 

A particular coin-tossing history is represented by a single element 
$f \in \mathcal{C}^\infty$. Specifically, $f(0)$ is the result of today's coin toss, $f(n)$ is 
the result of the toss $n$ days from now (assuming $n > 0$), and $f(-n)$ 
is the result of the toss $n$ days ago. Tomorrow, the history will be 
represented by an element $g \in \mathcal{C}^\infty$ that is closely related to $f$, namely, 
$f(n + 1) = g(n)$ for every $n$. (Saying today that "I will toss a head 
$n+1$ days from now" is equivalent to saying tomorrow that "I will toss 
a head $n$ days from now.") This suggests a dynamical system (called 
a Bernoulli shift) that consists of translating each sequence of H's 
and T's one notch to the left: 

$T_{\mathrm{Bern}}\colon \mathcal{C}^\infty \to \mathcal{C}^\infty$ is defined by $(T_{\mathrm{Bern}} f)(n) = f(n + 1)$. 

Figure 2.1B. Baker's Transformation: the right half 
of the loaf is placed on top of the left, and then the 
pile is pressed down to its original height. 

It is well known that almost every coin-tossing history consists (in the 
limit) of half heads and half tails. More generally, it can be shown that 
almost every orbit in this dynamical system is uniformly distributed, 
so it is "ergodic." 

It is also helpful to see the Bernoulli shift from a more concrete 
point of view: 

(2.1.4) Example (Baker's Transformation). A baker starts with a 
lump of dough in the shape of the unit square $[0, 1]^2$. She cuts the 
dough in half along the line $x = 1/2$ (see Fig. 2.1B), and places the 
right half $[1/2, 1] \times [0, 1]$ above the left half (to become $[0, 1/2] \times [1, 2]$). 
The dough is now 2 units tall (and 1/2 unit wide). She then pushes 
down on the dough, reducing its height, so it widens to retain the same 
area, until the dough is again square. (The pushing applies the linear 
map $(x, y) \mapsto (2x, y/2)$ to the dough.) More formally, 

\[ T_{\mathrm{Bake}}(x, y) = \begin{cases} (2x,\ y/2) & \text{if } x < 1/2, \\ (2x - 1,\ (y + 1)/2) & \text{if } x > 1/2. \end{cases} \tag{2.1.5} \]

(This is not well defined on the set $\{x = 1/2\}$ of measure zero.) 
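Formula (2.1.5) is easy to experiment with. The sketch below uses exact rational arithmetic; the helper names `baker` and `binary_digits` and the sample point are illustrative assumptions, and the point $x = 1/2$ (where the map is not defined) is arbitrarily sent to the second branch. It also records the effect on binary digits: the leading binary digit of $x$ is removed from $x$ and placed in front of the digits of $y$.

```python
from fractions import Fraction

def baker(point):
    """One application of the Baker's Transformation, following (2.1.5)."""
    x, y = point
    if x < Fraction(1, 2):
        return (2 * x, y / 2)
    # x = 1/2 is undefined in (2.1.5); sending it here is an arbitrary choice.
    return (2 * x - 1, (y + 1) / 2)

def binary_digits(t, count):
    """First `count` binary digits of a number t in [0, 1)."""
    digits = []
    for _ in range(count):
        t *= 2
        digit = int(t >= 1)
        digits.append(digit)
        t -= digit
    return digits

x = Fraction(0b1011001, 2**7)    # x = 0.1011001 in binary
y = Fraction(0b0110, 2**4)       # y = 0.0110 in binary
new_x, new_y = baker((x, y))

print(binary_digits(x, 7), binary_digits(y, 4))
print(binary_digits(new_x, 6), binary_digits(new_y, 5))
```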

Any point $(x, y)$ in $[0, 1]^2$ can be represented, in binary, by two 
strings of 0's and 1's: 

\[ (x, y) = (0.x_0 x_1 x_2 \ldots,\ 0.y_1 y_2 y_3 \ldots), \]

and we have 

\[ T_{\mathrm{Bake}}(0.x_0 x_1 x_2 \ldots,\ 0.y_1 y_2 y_3 \ldots) = (0.x_1 x_2 \ldots,\ 0.x_0 y_1 y_2 y_3 \ldots), \]

so we see that 

    $T_{\mathrm{Bern}}$ is isomorphic to $T_{\mathrm{Bake}}$ 
    (modulo a set of measure zero),                                (2.1.6) 

by identifying $f \in \mathcal{C}^\infty$ with 

\[ \bigl( 0.f(0)\, f(1)\, f(2) \ldots,\ 0.f(-1)\, f(-2)\, f(-3) \ldots \bigr), \]

where $H = 0$ and $T = 1$ (or vice-versa). 

Exercises for §2.1. 

#1. Show $\beta$ is rational if and only if there is some positive integer $k$, 
such that $(T_\beta)^k(x) = x$ for all $x \in \mathbb{T}$. 

#2. Show that if $\beta$ is irrational, then every orbit of $T_\beta$ is uniformly 
distributed on the circle; that is, if $I$ is any arc of the circle, and 
$x$ is any point in $\mathbb{T}$, show that 

\[ \lim_{N \to \infty} \frac{\#\{\, k \in \{1, 2, \ldots, N\} \mid T_\beta^k(x) \in I \,\}}{N} = \operatorname{length}(I). \]

[Hint: Exer. 1.3#1.] 


2.2. Unpredictability 

There is a fundamental difference between our two examples: the 
Bernoulli shift (or Baker's Transformation) is much more "random" 
or "unpredictable" than an irrational rotation. (Formally, this will be 
reflected in the fact that the entropy of a Bernoulli shift is nonzero, 
whereas that of an irrational rotation is zero.) Both of these dynamical 
systems are deterministic, so, from a certain point of view, neither is 
unpredictable. But the issue here is to predict behavior of the dynam- 
ical system from imperfect knowledge, and these two examples look 
fundamentally different from this point of view. 

(2.2.1) Example. Suppose we know the location of some point $x$ 
of $\mathbb{T}$ to within a distance of less than 0.1; that is, we have a point 
$x_0 \in \mathbb{T}$, and we know that $d(x, x_0) < 0.1$. Then, for every $n$, we 
also know the location of $T_\beta^n(x)$ to within a distance of 0.1. Namely, 
$d(T_\beta^n(x), T_\beta^n(x_0)) < 0.1$, because $T_\beta$ is an isometry of the circle. Thus, 
we can predict the location of $T_\beta^n(x)$ fairly accurately. 

The Baker's Transformation is not predictable in this sense: 







Figure 2.2A. Kneading the dough stretches a circular 
disk to a narrow, horizontal ellipse. 



(2.2.2) Example. Suppose there is an impurity in the baker's bread 
dough, and we know its location to within a distance of less than 0.1. 
After the dough has been kneaded once, our uncertainty in the horizon- 
tal direction has doubled (see Fig. 2.2A). Kneading again doubles the 
horizontal uncertainty, and perhaps adds a second possible vertical lo- 
cation (if the cut $\{x = 1/2\}$ goes through the region where the impurity 
might be). As the baker continues to knead, our hold on the impurity 
continues to deteriorate (very quickly, at an exponential rate!). Af- 
ter, say, 20 kneadings, the impurity could be almost anywhere in the 
loaf: every point in the dough is at a distance less than 0.001 from a 
point where the impurity could be (see Exer. 1). In particular, we now 
have no idea whether the impurity is in the left half of the loaf or the 
right half. 
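The contrast between Examples 2.2.1 and 2.2.2 can be made concrete with a small experiment. This is a sketch under stated assumptions: the initial gap $2^{-20}$, the two starting points, and the helper names are illustrative choices, and a rational number stands in for $\beta$ so that the arithmetic is exact (the isometry property used here holds for every $\beta$).

```python
from fractions import Fraction

def rotate(t, beta):
    """Rotation of the circle R/Z: t -> t + beta (mod 1)."""
    return (t + beta) % 1

def circle_distance(s, t):
    """Arc-length distance between two points of the circle R/Z."""
    d = abs(s - t) % 1
    return min(d, 1 - d)

def baker(point):
    """Baker's Transformation (2.1.5); x = 1/2 is sent to the second branch."""
    x, y = point
    if x < Fraction(1, 2):
        return (2 * x, y / 2)
    return (2 * x - 1, (y + 1) / 2)

gap = Fraction(1, 2**20)          # initial uncertainty (assumed for illustration)
beta = Fraction(17, 1000)         # rational stand-in for beta, exact arithmetic

s, t = Fraction(0), gap                                   # two points on the circle
p, q = (Fraction(0), Fraction(0)), (gap, Fraction(0))     # two points in the square
rotation_gaps, baker_gaps = [], []
for n in range(20):
    rotation_gaps.append(circle_distance(s, t))   # stays equal to the initial gap
    baker_gaps.append(abs(p[0] - q[0]))           # horizontal gap doubles each step
    s, t = rotate(s, beta), rotate(t, beta)
    p, q = baker(p), baker(q)

print(rotation_gaps[-1], baker_gaps[-1])
```

For this particular pair of points the horizontal gap under kneading grows from $2^{-20}$ to $1/2$ in 19 steps, while the gap under rotation never changes.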

The upshot is that a small change in an initial position can quickly 
lead to a large change in the subsequent location of a particle. Thus, 
errors in measurement make it impossible to predict the future course 
of the system with any reasonable precision. (Many scientists believe 
that weather forecasting suffers from this difficulty — it is said that 
a butterfly flapping its wings in Africa may affect the next month's 
rainfall in Chicago.) 

To understand entropy, it is important to look at unpredictability 
from a different point of view, which can easily be illustrated by the 
Bernoulli shift. 

(2.2.3) Example. Suppose we have tossed a fair coin 1000 times. 
Knowing the results of these tosses does not help us predict the next 
toss: there is still a 50% chance of heads and a 50% chance of tails. 

More formally, define a function $\chi\colon \mathcal{C}^\infty \to \{H, T\}$ by $\chi(f) = f(0)$. 
Then the values 

\[ \chi(T^{-1000}(f)),\ \chi(T^{-999}(f)),\ \chi(T^{-998}(f)),\ \ldots,\ \chi(T^{-1}(f)) \]

give no information about the value of $\chi(f)$. Thus, $T_{\mathrm{Bern}}$ is quite un- 
predictable, even if we have a lot of past history to go on. As we will 
see, this means that the entropy of $T_{\mathrm{Bern}}$ is not zero. 

(2.2.4) Example. In contrast, consider an irrational rotation $T_\beta$. For 
concreteness, let us take $\beta = \sqrt{3}/100 = .01732\ldots$, and, for convenience, 
let us identify $\mathbb{T}$ with the half-open interval $[0, 1)$ (in the natural way). 
Let $\chi\colon [0, 1) \to \{0, 1\}$ be the characteristic function of $[1/2, 1)$. Then, 
because $\beta \approx 1/60$, the sequence 

\[ \ldots,\ \chi(T_\beta^{-1}(x)),\ \chi(T_\beta^{0}(x)),\ \chi(T_\beta^{1}(x)),\ \chi(T_\beta^{2}(x)),\ \ldots \]

consists of alternating strings of about thirty 0's and about thirty 1's. 
Thus, 

    if $\chi(T_\beta^{-1}(x)) = 0$ and $\chi(T_\beta^{0}(x)) = 1$, then 
    we know that $\chi(T_\beta^{k}(x)) = 1$ for $k = 1, \ldots, 25$. 

So the results are somewhat predictable. (In contrast, consider the 
fortune that could be won by predicting 25 coin tosses on a fairly regular 
basis!) 

But that is only the beginning. The values of 

\[ \chi(T_\beta^{-1000}(x)),\ \chi(T_\beta^{-999}(x)),\ \chi(T_\beta^{-998}(x)),\ \ldots,\ \chi(T_\beta^{-1}(x)) \]

can be used to determine the position of $x$ fairly accurately. Using this 
more subtle information, one can predict far more than just 25 values 
of $\chi$: it is only when $T_\beta^k(x)$ happens to land very close to 0 or 1/2 that 
the value of $\chi(T_\beta^k(x))$ provides any new information. Indeed, knowing 
more and more values of $\chi(T_\beta^k(x))$ allows us to make longer and longer 
strings of predictions. In the limit, the amount of unpredictability goes 
to 0, so it turns out that the entropy of $T_\beta$ is 0. 
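The claims about strings of 0's and 1's in Example 2.2.4 can be checked by direct computation. The sketch below is a floating-point experiment, not part of the argument; the starting point 0.123, the number of steps, and the function names are illustrative assumptions.

```python
import math
from itertools import groupby

beta = math.sqrt(3) / 100                     # the rotation number of Example 2.2.4

def chi(t):
    """Characteristic function of [1/2, 1) on the circle [0, 1)."""
    return 1 if t >= 0.5 else 0

def symbol_sequence(x, n_steps):
    """The values chi(T_beta^k(x)) for k = 0, 1, ..., n_steps - 1."""
    return [chi((x + k * beta) % 1.0) for k in range(n_steps)]

symbols = symbol_sequence(x=0.123, n_steps=5000)

# Lengths of the maximal strings of equal symbols; the first and last
# strings may be cut off by the ends of the sample, so they are dropped.
run_lengths = [len(list(group)) for _, group in groupby(symbols)]
interior_runs = run_lengths[1:-1]
print(min(interior_runs), max(interior_runs))

# After every switch from 0 to 1, the next 25 symbols should all be 1.
predictions_hold = all(
    all(s == 1 for s in symbols[k + 1:k + 26])
    for k in range(1, len(symbols) - 25)
    if symbols[k - 1] == 0 and symbols[k] == 1
)
print(predictions_hold)
```

Since $1/(2\beta) \approx 28.9$, each complete string has length 28 or 29, which is the "about thirty" of the text.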

We remark that the relationship between entropy and unpre- 
dictability can be formalized as follows (see Exer. 2.4#12). 

(2.2.5) Proposition. The entropy of a transformation is 0 if and only 
if the past determines the future (almost surely). 

More precisely, the entropy of $T$ is 0 if and only if, for each par- 
tition $\mathcal{A}$ of $\Omega$ into finitely many measurable sets, there is a conull sub- 
set $\Omega'$, such that, for $x, y \in \Omega'$, 

    if $T^k(x) \sim T^k(y)$, for all $k < 0$, then $T^k(x) \sim T^k(y)$, for all $k$, 

where $\sim$ is the equivalence relation corresponding to the partition $\mathcal{A}$: 
namely, $x \sim y$ if there exists $A \in \mathcal{A}$ with $\{x, y\} \subset A$. 



(2.2.6) Example. 

1) Knowing the entire past history of a fair coin does not tell us what 
the next toss will be, so, in accord with Eg. 2.2.3, Prop. 2.2.5 
implies the entropy of $T_{\mathrm{Bern}}$ is not 0. 

2) For the Baker's Transformation, let 

\[ \mathcal{A} = \bigl\{\, [0, 1/2) \times [0, 1],\ [1/2, 1] \times [0, 1] \,\bigr\} \]

be the partition of the unit square into a left half and a right 
half. The inverse image of any horizontal line segment lies en- 
tirely in one of these halves of the square (and is horizontal) (see 
Fig. 2.2B), so (by induction), if $x$ and $y$ lie on the same hori- 
zontal line segment, then $T_{\mathrm{Bake}}^k(x) \sim T_{\mathrm{Bake}}^k(y)$ for all $k < 0$. On 
the other hand, it is (obviously) easy to find two points $x$ and $y$ 
on a horizontal line segment, such that $x$ and $y$ are in oppo- 
site halves of the partition. So the past does not determine the 
present (or the future). This is an illustration of the fact that the 
entropy $h(T_{\mathrm{Bake}})$ of $T_{\mathrm{Bake}}$ is not 0 (see 2.2.2). 

3) Let $\mathcal{A} = \{I, \mathbb{T} \smallsetminus I\}$, for some (proper) arc $I$ of $\mathbb{T}$. If $\beta$ is irrational, 
then, for any $x \in \mathbb{T}$, the set $\{\, T_\beta^k(x) \mid k < 0 \,\}$ is dense in $\mathbb{T}$. From 
this observation, it is not difficult to see that if $T_\beta^k(x) \sim T_\beta^k(y)$ 
for all $k < 0$, then $x = y$. Hence, for an irrational rotation, the 
past does determine the future. This is a manifestation of the 
fact that the entropy $h(T_\beta)$ of $T_\beta$ is 0 (see 2.2.4). 

Figure 2.2B. The inverse image of horizontal line segments. 

Exercise for §2.2. 

#1. Show, for any $x$ and $y$ in the unit square, that there exists $x'$, 
such that $d(x, x') < 0.1$ and $d(T_{\mathrm{Bake}}^{20}(x'), y) < 0.01$. 

2.3. Definition of entropy 

The fundamental difference between the behavior of the above two 
examples will be formalized by the notion of the entropy of a dynamical 
system, but, first, we define the entropy of a partition. 

(2.3.1) Remark. Let us give some motivation for the following defini- 
tion. Suppose we are interested in the location of some point $\omega$ in some 
probability space $\Omega$. 

• If $\Omega$ has been divided into 2 sets of equal measure, and we know 
which of these sets the point $\omega$ belongs to, then we have 1 bit of 
information about the location of $\omega$. (The two halves of $\Omega$ can be 
labelled '0' and '1'.) 

• If $\Omega$ has been divided into 4 sets of equal measure, and we know 
which of these sets the point $\omega$ belongs to, then we have 2 bits of 
information about the location of $\omega$. (The four quarters of $\Omega$ can 
be labelled '00', '01', '10', and '11'.) 

• More generally, if $\Omega$ has been divided into $2^k$ sets of equal mea- 
sure, and we know which of these sets the point $\omega$ belongs to, 
then we have exactly $k$ bits of information about the location 
of $\omega$. 

• The preceding observation suggests that if $\Omega$ has been divided 
into $n$ sets of equal measure, and we know which of these sets the 
point $\omega$ belongs to, then we have exactly $\log_2 n$ bits of information 
about the location of $\omega$. 

• But there is no need to actually divide $\Omega$ into pieces: if we have 
a certain subset $A$ of $\Omega$, with $\mu(A) = 1/n$, and we know that $\omega$ 
belongs to $A$, then we can say that we have exactly $\log_2 n$ bits of 
information about the location of $\omega$. 

• More generally, if we know that $\omega$ belongs to a certain subset $A$ 
of $\Omega$, and $\mu(A) = p$, then we should say that we have exactly 
$\log_2(1/p)$ bits of information about the location of $\omega$. 

• Now, suppose $\Omega$ has been partitioned into finitely many subsets 
$A_1, \ldots, A_m$, of measure $p_1, \ldots, p_m$, and that we will be told which 
of these sets a random point $\omega$ belongs to. Then the right-hand 
side of Eq. (2.3.3) is the amount of information that we can expect 
(in the sense of probability theory) to receive about the location 
of $\omega$. 

(2.3.2) Definition. Suppose $\mathcal{A} = \{A_1, \ldots, A_m\}$ is a partition of a 
probability space $(\Omega, \mu)$ into finitely many measurable sets of measure 
$p_1, \ldots, p_m$ respectively. (Each set $A_i$ is called an atom of $\mathcal{A}$.) The 
entropy of this partition is 

\[ H(\mathcal{A}) = H(p_1, \ldots, p_m) = \sum_{i=1}^{m} p_i \log \frac{1}{p_i}. \tag{2.3.3} \]

If $p_i = 0$, then $p_i \log(1/p_i) = 0$, by convention. Different authors may 
use different bases for the logarithm, usually either $e$ or 2, so this def- 
inition can be varied by a scalar multiple. Note that entropy is never 
negative. 
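The definition translates directly into a few lines of code. This is a minimal sketch; the function name `partition_entropy` and the `base` parameter are illustrative choices, and the input is simply the list of measures of the atoms.

```python
import math

def partition_entropy(probabilities, base=math.e):
    """Entropy sum of p_i * log(1/p_i) of a partition whose atoms have the
    given measures, with the convention that terms with p_i = 0 contribute 0."""
    if any(p < 0 for p in probabilities):
        raise ValueError("measures of atoms must be nonnegative")
    if not math.isclose(sum(probabilities), 1.0):
        raise ValueError("measures of atoms must sum to 1")
    return sum(p * math.log(1 / p, base) for p in probabilities if p > 0)

# With base 2 the entropy is measured in bits, as in Remark 2.3.1.
print(partition_entropy([0.5, 0.5], base=2))      # two halves
print(partition_entropy([0.25] * 4, base=2))      # four quarters
print(partition_entropy([0.7, 0.2, 0.1]))         # natural logarithm
```

Dividing the space into $2^k$ atoms of equal measure gives $k$ bits, and adding an atom of measure zero does not change the value.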

(2.3.4) Remark. Let us motivate the definition in another way. Think 
of the partition $\mathcal{A}$ as representing an experiment with $m$ (mutually 
exclusive) possible outcomes. (The probability of the $i$th outcome is $p_i$.) 
We wish $H(\mathcal{A})$ to represent the amount of information one can expect to 
gain by performing the experiment. Alternatively, it can be thought of 
as the amount of uncertainty regarding the outcome of the experiment. 
For example, $H(\{\Omega\}) = 0$, because we gain no new information by 
performing an experiment whose outcome is known in advance. 

Let us list some properties of $H$ that one would expect, if it is to 
fit our intuitive understanding of it as the information gained from an 
experiment. 

1) The entropy does not depend on the particular subsets cho- 
sen for the partition, but only on their probabilities. Thus, 
for $p_1, \ldots, p_n \ge 0$ with $\sum_i p_i = 1$, we have a real number 
$H(p_1, \ldots, p_n) \ge 0$. Furthermore, permuting the probabilities 
$p_1, \ldots, p_n$ does not change the value of the entropy $H(p_1, \ldots, p_n)$. 

2) An experiment yields no information if and only if we can predict 
its outcome with certainty. Therefore, we have $H(p_1, \ldots, p_n) = 0$ 
if and only if $p_i = 1$ for some $i$. 

3) If a certain outcome of an experiment is impossible, then there is 
no harm in eliminating it from the description of the experiment, 
so $H(p_1, \ldots, p_n, 0) = H(p_1, \ldots, p_n)$. 

4) The least predictable experiment is one in which all outcomes 
are equally likely, so, for given $n$, the function $H(p_1, \ldots, p_n)$ is 
maximized at $(1/n, \ldots, 1/n)$. 

5) $H(p_1, \ldots, p_n)$ is a continuous function of its arguments. 

6) Our final property is somewhat more sophisticated, but still in- 
tuitive. Suppose we have two finite partitions $\mathcal{A}$ and $\mathcal{B}$ (not nec- 
essarily independent), and let 

\[ \mathcal{A} \vee \mathcal{B} = \{\, A \cap B \mid A \in \mathcal{A},\ B \in \mathcal{B} \,\} \]

be their join. The join corresponds to performing both experi- 
ments. We would expect to get the same amount of information 
by performing the two experiments in succession, so 

\[ H(\mathcal{A} \vee \mathcal{B}) = H(\mathcal{A}) + H(\mathcal{B} \mid \mathcal{A}), \]

where $H(\mathcal{B} \mid \mathcal{A})$, the expected (or conditional) entropy of $\mathcal{B}$, 
given $\mathcal{A}$, is the amount of information expected from performing 
experiment $\mathcal{B}$, given that experiment $\mathcal{A}$ has already been per- 
formed. 

More precisely, if experiment $\mathcal{A}$ has been performed, then 
some event $A$ has been observed. The amount of information $H_A(\mathcal{B})$ 
now expected from experiment $\mathcal{B}$ is given by the entropy of the 
partition $\{B_j \cap A\}_{j=1}^{n}$ of $A$, so 

\[ H_A(\mathcal{B}) = H\!\left( \frac{\mu(B_1 \cap A)}{\mu(A)},\ \frac{\mu(B_2 \cap A)}{\mu(A)},\ \ldots,\ \frac{\mu(B_n \cap A)}{\mu(A)} \right). \]

This should be weighted by the probability of $A$, so 

\[ H(\mathcal{B} \mid \mathcal{A}) = \sum_{A \in \mathcal{A}} H_A(\mathcal{B})\, \mu(A). \]

An elementary (but certainly nontrivial) argument shows that any 
function $H$ satisfying all of these conditions must be as described in 
Defn. 2.3.2 (for some choice of the base of the logarithm). 
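Property (6) can be tested numerically on a small example. In the sketch below, two partitions are encoded by a table whose $(i, j)$ entry is the measure of the intersection of the $i$th atom of the first partition with the $j$th atom of the second; the table itself and all function names are illustrative assumptions, not data from the text.

```python
import math

def entropy(probabilities):
    """Sum of p * log(1/p), with the convention 0 * log(1/0) = 0."""
    return sum(p * math.log(1 / p) for p in probabilities if p > 0)

def join_entropy(joint):
    """Entropy of the join: its atoms are the intersections A_i with B_j."""
    return entropy([p for row in joint for p in row])

def conditional_entropy(joint):
    """Expected entropy of the second partition, given the first.
    Row i lists the measures of B_j intersected with the atom A_i."""
    total = 0.0
    for row in joint:
        mass = sum(row)                       # mu(A_i)
        if mass > 0:
            total += mass * entropy([p / mass for p in row])
    return total

# joint[i][j] = mu(A_i intersect B_j); an arbitrary non-independent example.
joint = [[0.10, 0.30],
         [0.25, 0.05],
         [0.20, 0.10]]

h_first = entropy([sum(row) for row in joint])            # entropy of the A_i
h_second = entropy([sum(col) for col in zip(*joint)])     # entropy of the B_j
lhs = join_entropy(joint)
rhs = h_first + conditional_entropy(joint)
print(lhs, rhs)
```

The same table also illustrates that conditioning does not increase entropy: the conditional entropy is at most `h_second`.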

The entropy of a partition leads directly to the definition of the 
entropy of a dynamical system. To motivate this, think about repeating 
the same experiment every day. The first day we presumably obtain 
some information from the outcome of the experiment. The second day 
may yield some additional information (the result of an experiment 
- such as recording the time of sunrise — may change from day to 
day). And so on. If the dynamical system is "predictable," then later 
experiments do not yield much new information. On the other hand, in 
a truly unpredictable system, such as a coin toss, we learn something 
new every day — the expected total amount of information is directly 
proportional to the number of times the experiment has been repeated. 
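The total amount of information expected to be obtained after $k$ daily 
experiments (starting today) is 

\[ E_k(T, \mathcal{A}) = H\bigl( \mathcal{A} \vee T^{-1}(\mathcal{A}) \vee T^{-2}(\mathcal{A}) \vee \cdots \vee T^{-(k-1)}(\mathcal{A}) \bigr), \]

where 

\[ T^{\ell}(\mathcal{A}) = \{\, T^{\ell}(A) \mid A \in \mathcal{A} \,\} \]

(see Exer. 6). It is not difficult to see that the limit 

\[ h(T, \mathcal{A}) = \lim_{k \to \infty} \frac{E_k(T, \mathcal{A})}{k} \]

exists (see Exer. 14). The entropy of $T$ is the value of this limit for the 
most effective experiment: 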

(2.3.5) Definition. The entropy $h(T)$ is the supremum of $h(T, \mathcal{A})$ 
over all partitions $\mathcal{A}$ of $\Omega$ into finitely many measurable sets. 

(2.3.6) Notation. The entropy of $T$ may depend on the choice of the 
invariant measure $\mu$, so, to avoid confusion, it may sometimes be de- 
noted $h_\mu(T)$. 

(2.3.7) Remark. The entropy of a flow is defined to be the entropy 
of its time-one map; that is, $h(\{\varphi_t\}) = h(\varphi_1)$. 

(2.3.8) Remark. Although we make no use of it in these lectures, we 
mention that there is also a notion of entropy that is purely topolog- 
ical. Note that $E_k(T, \mathcal{A})$ is large if the partition 

\[ \mathcal{A} \vee T^{-1}(\mathcal{A}) \vee T^{-2}(\mathcal{A}) \vee \cdots \vee T^{-(k-1)}(\mathcal{A}) \]

consists of very many small sets. In pure topology, without a measure, 
one cannot say whether or not the sets in a collection are "small," but 
one can say whether or not there are very many of them, and that 
is the basis of the definition. However, the topological definition uses 
open covers of the space, instead of measurable partitions of the space. 

Specifically, suppose $T$ is a homeomorphism of a compact metric 
space $X$. 

1) For each open cover $\mathcal{A}$ of $X$, let $N(\mathcal{A})$ be the minimal cardinality 
of a subcover. 

2) If $\mathcal{A}$ and $\mathcal{B}$ are open covers, let 

\[ \mathcal{A} \vee \mathcal{B} = \{\, A \cap B \mid A \in \mathcal{A},\ B \in \mathcal{B} \,\}. \]

3) Define 

\[ h_{\mathrm{top}}(T) = \sup_{\mathcal{A}} \lim_{k \to \infty} \frac{\log N\bigl( \mathcal{A} \vee T^{-1}(\mathcal{A}) \vee \cdots \vee T^{-(k-1)}(\mathcal{A}) \bigr)}{k}. \]

It can be shown, for every $T$-invariant probability measure $\mu$ on $X$, 
that $h_\mu(T) \le h_{\mathrm{top}}(T)$. 

Exercises for §2.3. 

#1. Show the function H defined in Eq. (2.3.3) satisfies the formulas 
in: 

(a) Rem. 2.3.4(2), 

(b) Rem. 2.3.4(3), 

(c) Rem. 2.3.4(4), 

(d) Rem. 2.3.4(5), and 

(e) Rem. 2.3.4(6). 

#2. Intuitively, it is clear that altering an experiment to produce more 
refined outcomes will not reduce the amount of information pro- 
vided by the experiment. Formally, show that if $\mathcal{A} \subset \mathcal{B}$, then 
$H(\mathcal{A}) \le H(\mathcal{B})$. (We write $\mathcal{A} \subset \mathcal{B}$ if each atom of $\mathcal{A}$ is a union of 
atoms of $\mathcal{B}$ (up to measure zero).) 

#3. It is easy to calculate that the entropy of a combination of inde- 
pendent experiments is precisely the sum of the entropies of the 
individual experiments. Intuitively, it is reasonable to expect that 
independent experiments provide the most information (because 
they have no redundancy). Formally, show 

\[ H(\mathcal{A}_1 \vee \cdots \vee \mathcal{A}_n) \le \sum_{i=1}^{n} H(\mathcal{A}_i). \]

[Hint: Assume $n = 2$ and use Lagrange Multipliers.] 

#4. Show $H(\mathcal{B} \mid \mathcal{A}) \le H(\mathcal{B})$. 
[Hint: Exer. 3.] 

#5. Show $H(T^{\ell} \mathcal{A}) = H(\mathcal{A})$, for all $\ell \in \mathbb{Z}$. 
[Hint: $T^{\ell}$ is measure preserving.] 

#6. For $x, y \in \Omega$, show that $x$ and $y$ are in the same atom of 
$\bigvee_{\ell=0}^{k-1} T^{-\ell}(\mathcal{A})$ if and only if, for all $\ell \in \{0, \ldots, k-1\}$, the two 
points $T^{\ell}(x)$ and $T^{\ell}(y)$ are in the same atom of $\mathcal{A}$. 

#7. Show $h(T) = h(T^{-1})$. 
[Hint: Exer. 5 implies $E_k(T, \mathcal{A}) = E_k(T^{-1}, \mathcal{A})$.] 

#8. Show $h(T^{\ell}) = |\ell|\, h(T)$, for all $\ell \in \mathbb{Z}$. 
[Hint: For $\ell > 0$, consider $E_k\bigl(T^{\ell}, \bigvee_{i=0}^{\ell-1} T^{-i} \mathcal{A}\bigr)$.] 

#9. Show that entropy is an isomorphism invariant. That is, if 

• $\psi\colon (\Omega, \mu) \to (\Omega', \mu')$ is a measure-preserving map, such that 

• $\psi(T(\omega)) = T'(\psi(\omega))$ a.e., 

then $h_\mu(T) = h_{\mu'}(T')$. 

#10. Show $h(T, \mathcal{A}) \le H(\mathcal{A})$. 
[Hint: Exers. 5 and 3.] 

#11. Show that if $\mathcal{A} \subset \mathcal{B}$, then $h(T, \mathcal{A}) \le h(T, \mathcal{B})$. 

#12. Show $|h(T, \mathcal{A}) - h(T, \mathcal{B})| \le H(\mathcal{B} \mid \mathcal{A}) + H(\mathcal{A} \mid \mathcal{B})$. 
[Hint: Reduce to the case where $\mathcal{A} \subset \mathcal{B}$, by using the fact that $\mathcal{A} \vee \mathcal{B}$ 
contains both $\mathcal{A}$ and $\mathcal{B}$.] 

#13. Show that the sequence $E_k(T, \mathcal{A})$ is subadditive; that is, 
$E_{k+\ell}(T, \mathcal{A}) \le E_k(T, \mathcal{A}) + E_{\ell}(T, \mathcal{A})$. 

#14. Show that $\lim_{k \to \infty} \frac{1}{k} E_k(T, \mathcal{A})$ exists. 
[Hint: Exer. 13.] 

#15. Show that if $T$ is an isometry of a compact metric space, then 
$h_{\mathrm{top}}(T) = 0$. 
[Hint: If $\mathcal{O}_\epsilon$ is the open cover by balls of radius $\epsilon$, then $T^{\ell}(\mathcal{O}_\epsilon) = \mathcal{O}_\epsilon$. 
Choose $\epsilon$ to be a Lebesgue number of the open cover $\mathcal{A}$; that is, every 
ball of radius $\epsilon$ is contained in some element of $\mathcal{A}$.] 

2.4. How to calculate entropy 

Definition 2.3.5 is difficult to apply in practice, because of the 
supremum over all possible finite partitions. The following theorem 
eliminates this difficulty, by allowing us to consider only a single parti- 
tion. (See Exer. 9 for the proof.) 

(2.4.1) Theorem. If $\mathcal{A}$ is any finite generating partition for $T$, that 
is, if 

\[ \bigcup_{k=-\infty}^{\infty} T^k(\mathcal{A}) \]

generates the $\sigma$-algebra of all measurable sets (up to measure 0), then 

\[ h(T) = h(T, \mathcal{A}). \]

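(2.4.2) Corollary. For any $\beta \in \mathbb{R}$, $h(T_\beta) = 0$. 

Proof. Let us assume $\beta$ is irrational. (The other case is easy; see 
Exer. 1.) Let 

• $I$ be any (nonempty) proper arc of $\mathbb{T}$, and 

• $\mathcal{A} = \{I, \mathbb{T} \smallsetminus I\}$. 

It is easy to see that if $\mathcal{B}$ is any finite partition of $\mathbb{T}$ into connected sets, 
then $\#(\mathcal{A} \vee \mathcal{B}) \le 2 + \#\mathcal{B}$. Hence 

\[ \#\bigl( \mathcal{A} \vee T_\beta^{-1}(\mathcal{A}) \vee T_\beta^{-2}(\mathcal{A}) \vee \cdots \vee T_\beta^{-(k-1)}(\mathcal{A}) \bigr) \le 2k \tag{2.4.3} \]

(see Exer. 3), so, using 2.3.4(4), we see that 

\[ \begin{aligned}
E_k(T_\beta, \mathcal{A}) &= H\bigl( \mathcal{A} \vee T_\beta^{-1}(\mathcal{A}) \vee T_\beta^{-2}(\mathcal{A}) \vee \cdots \vee T_\beta^{-(k-1)}(\mathcal{A}) \bigr) \\
&\le H\!\left( \frac{1}{2k}, \frac{1}{2k}, \ldots, \frac{1}{2k} \right) \\
&= \sum_{i=1}^{2k} \frac{1}{2k} \log(2k) \\
&= \log(2k).
\end{aligned} \]

Therefore 

\[ h(T_\beta, \mathcal{A}) = \lim_{k \to \infty} \frac{E_k(T_\beta, \mathcal{A})}{k} \le \lim_{k \to \infty} \frac{\log(2k)}{k} = 0. \]

One can show that $\mathcal{A}$ is a generating partition for $T_\beta$ (see Exer. 4), 
so Thm. 2.4.1 implies that $h(T_\beta) = 0$. □ 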
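The bound $E_k \le \log(2k)$ from the proof of Cor. 2.4.2 can be observed numerically. The following floating-point sketch makes several assumptions that are not in the text: the arc is taken to be $[1/2, 1)$, the rotation number is $\sqrt{3}/100$ (as in Example 2.2.4), and each atom of the joined partition is identified by the itinerary of the midpoint of an arc between consecutive cut points.

```python
import math
from collections import defaultdict

def rotation_join_entropy(beta, k):
    """Entropy and number of atoms of the join of the partitions
    T^{-j}(A), j = 0, ..., k-1, where A = {[0, 1/2), [1/2, 1)} and T is
    rotation by beta. Atoms are grouped by itinerary (see Exer. 2.3#6)."""
    # T^{-j} moves the endpoints 0 and 1/2 of the two arcs to -j*beta and 1/2 - j*beta.
    cuts = sorted({(c - j * beta) % 1.0 for j in range(k) for c in (0.0, 0.5)})
    atom_mass = defaultdict(float)
    for left, right in zip(cuts, cuts[1:] + [cuts[0] + 1.0]):
        midpoint = (left + right) / 2
        itinerary = tuple(int((midpoint + j * beta) % 1.0 >= 0.5) for j in range(k))
        atom_mass[itinerary] += right - left
    entropy = sum(m * math.log(1 / m) for m in atom_mass.values() if m > 0)
    return entropy, len(atom_mass)

beta = math.sqrt(3) / 100
results = {k: rotation_join_entropy(beta, k) for k in (1, 10, 100)}
for k, (entropy, n_atoms) in results.items():
    print(k, n_atoms, entropy, math.log(2 * k), entropy / k)
```

The last column, $E_k / k$, shrinks as $k$ grows, in line with $h(T_\beta, \mathcal{A}) = 0$.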



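(2.4.4) Corollary. $h(T_{\mathrm{Bern}}) = h(T_{\mathrm{Bake}}) = \log 2$. 

Proof. Because $T_{\mathrm{Bern}}$ and $T_{\mathrm{Bake}}$ are isomorphic (see 2.1.6), they have 
the same entropy (see Exer. 2.3#9). Thus, we need only calculate 
$h(T_{\mathrm{Bern}})$. 

Let 

\[ A = \{\, f \in \mathcal{C}^\infty \mid f(0) = 1 \,\} \quad \text{and} \quad \mathcal{A} = \{A,\ \mathcal{C}^\infty \smallsetminus A\}. \]

Then 

\[ \mathcal{A} \vee T^{-1}(\mathcal{A}) \vee T^{-2}(\mathcal{A}) \vee \cdots \vee T^{-(k-1)}(\mathcal{A}) \]

consists of the $2^k$ sets of the form 

\[ C_{\epsilon_0, \epsilon_1, \ldots, \epsilon_{k-1}} = \{\, f \in \mathcal{C}^\infty \mid f(\ell) = \epsilon_\ell, \text{ for } \ell = 0, 1, 2, \ldots, k-1 \,\}, \]

each of which has measure $1/2^k$. Therefore, 

\[ h(T_{\mathrm{Bern}}, \mathcal{A}) = \lim_{k \to \infty} \frac{E_k(T_{\mathrm{Bern}}, \mathcal{A})}{k} = \lim_{k \to \infty} \frac{2^k \cdot \bigl[ \frac{1}{2^k} \log 2^k \bigr]}{k} = \lim_{k \to \infty} \frac{k \log 2}{k} = \log 2. \]

One can show that $\mathcal{A}$ is a generating partition for $T_{\mathrm{Bern}}$ (see Exer. 5), 
so Thm. 2.4.1 implies that $h(T_{\mathrm{Bern}}) = \log 2$. □ 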
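The count in the proof of Cor. 2.4.4 can be reproduced by brute force for small $k$. This sketch enumerates the $2^k$ cylinder sets directly; the function name and the optional parameter `p_heads` (a biased coin, in the spirit of Exer. 6 of §2.4) are illustrative additions that are not part of the proof.

```python
import math
from itertools import product

def bernoulli_join_entropy(k, p_heads=0.5):
    """Entropy of the partition of coin-tossing histories into cylinder
    sets determined by the k tosses f(0), ..., f(k-1)."""
    probs = {"H": p_heads, "T": 1 - p_heads}
    total = 0.0
    for word in product("HT", repeat=k):
        mass = math.prod(probs[symbol] for symbol in word)   # product measure
        if mass > 0:
            total += mass * math.log(1 / mass)
    return total

for k in (1, 2, 5, 10):
    # For a fair coin, E_k / k equals log 2 for every k.
    print(k, bernoulli_join_entropy(k) / k, math.log(2))
```

In contrast with the rotation, the ratio $E_k / k$ does not decay: every toss contributes the same amount of new information.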

(2.4.5) Remark. One need not restrict to finite partitions; $h(T)$ is the 
supremum of $h(T, \mathcal{A})$, not only over all finite measurable partitions, but 
over all countable partitions $\mathcal{A}$, such that $H(\mathcal{A}) < \infty$. When considering 
countable partitions (of finite entropy), some sums have infinitely many 
terms, but, because the terms are positive, they can be rearranged at 
will. Thus, essentially the same proofs can be applied. 

We noted, in Prop. 2.2.5, that if $h(T) = 0$, then the past determines 
the present (and the future). That is, if we know the results of all 
past experiments, then we can predict the result of today's experiment. 
Thus, 0 is the amount of information we can expect to get by performing 
today's experiment. More generally, Thm. 2.4.8 below shows that $h(T)$ 
is always the amount of information we expect to obtain by performing 
today's experiment. 

(2.4.6) Example. As in Eg. 2.2.6(2), let $\mathcal{A}$ be the partition of the unit 
square into a left half and a right half. It is not difficult to see that 

    $x$ and $y$ lie on the same horizontal line segment 
    $\iff$ $T_{\mathrm{Bake}}^k(x) \sim T_{\mathrm{Bake}}^k(y)$ for all $k < 0$. 

(We ignore points for which one of the coordinates is a dyadic rational; 
they are a set of measure zero.) Thus, the results of past experi- 
ments tell us which horizontal line segment contains the point $\omega$ (and 
provide no other information). The partition $\mathcal{A}$ cuts this line segment 
precisely in half, so the two possible results are equally likely in today's 
experiment; the expected amount of information is $\log 2$, which, as we 
know, is the entropy of $T_{\mathrm{Bake}}$ (see 2.4.4). 

(2.4.7) Notation. 

• Let $\mathcal{A}^+ = \bigvee_{\ell=1}^{\infty} T^{\ell} \mathcal{A}$. Thus, $\mathcal{A}^+$ is the partition that corresponds 
to knowing the results of all past experiments (see Exer. 10). 

• Let $H(\mathcal{A} \mid \mathcal{A}^+)$ denote the conditional entropy of a partition $\mathcal{A}$ 
with respect to $\mathcal{A}^+$. More precisely: 

  ◦ the measure $\mu$ has a conditional measure $\mu_A$ on each 
    atom $A$ of $\mathcal{A}^+$; 
  ◦ the partition $\mathcal{A}$ induces a partition $\mathcal{A}_A$ of each atom $A$ of $\mathcal{A}^+$; 
  ◦ we have the entropy $H(\mathcal{A}_A)$ (with respect to the probability 
    measure $\mu_A$); and 
  ◦ $H(\mathcal{A} \mid \mathcal{A}^+)$ is the integral of $H(\mathcal{A}_A)$ over all of $\Omega$. 

Thus, $H(\mathcal{A} \mid \mathcal{A}^+)$ represents the amount of information we expect to 
obtain by performing experiment $\mathcal{A}$, given that we know all previous 
results of experiment $\mathcal{A}$. 

See Exer. 11 for the proof of the following theorem. 

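(2.4.8) Theorem. If $\mathcal{A}$ is any finite generating partition for $T$, then 
$h(T) = H(\mathcal{A} \mid \mathcal{A}^+)$. 

Because $T^{-1} \mathcal{A}^+ = \mathcal{A} \vee \mathcal{A}^+$, the following corollary is immediate. 

(2.4.9) Corollary. If $\mathcal{A}$ is any finite generating partition for $T$, then 
$h(T) = H(T^{-1} \mathcal{A}^+ \mid \mathcal{A}^+)$. 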

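Exercises for §2.4. 

#1. Show, directly from the definition of entropy, that if $\beta$ is rational, 
then $h(T_\beta) = 0$. 

#2. Show that if 

• $\mathcal{A} = \{\mathbb{T} \smallsetminus I, I\}$, where $I$ is a proper arc of $\mathbb{T}$, and 

• $\mathcal{B}$ is a finite partition of $\mathbb{T}$ into connected sets, 

then 

\[ \sum_{C \in \mathcal{A} \vee \mathcal{B}} (\text{\# components of } C) \le 2 + \#\mathcal{B}. \]

#3. Prove Eq. (2.4.3). 

#4. Show that if 

• $\mathcal{A} = \{\mathbb{T} \smallsetminus I, I\}$, where $I$ is a nonempty, proper arc of $\mathbb{T}$, and 

• $\beta$ is irrational, 

then $\mathcal{A}$ is a generating partition for $T_\beta$. 
[Hint: If $n$ is large, then $\bigvee_{k=0}^{n} T^k \mathcal{A}$ is a partition of $\mathbb{T}$ into small inter- 
vals. Thus, any open interval is a countable union of sets in $\bigcup_{n=1}^{\infty} \bigvee_{k=0}^{n} T^k \mathcal{A}$.] 

#5. Show that the partition $\mathcal{A}$ in the proof of Cor. 2.4.4 is a generating 
partition for $T_{\mathrm{Bern}}$. 

#6. The construction of a Bernoulli shift can be generalized, by 
using any probability space in place of $\mathcal{C}$. Show that if a Bernoulli 
shift $T$ is constructed from a measure space with probabilities 
$\{p_1, \ldots, p_n\}$, then $h(T) = H(p_1, \ldots, p_n)$. 

#7. Show that if $S$ is any finite, nonempty set of integers, then 
$h\bigl(T, \bigvee_{k \in S} T^k \mathcal{A}\bigr) = h(T, \mathcal{A})$. 

#8. Suppose 

• $\mathcal{A}$ is a finite generating partition for $T$, 

• $\mathcal{B}$ is a finite partition of $\Omega$, and 

• $\epsilon > 0$. 

Show that there is a finite set $S$ of integers, such that 
$H(\mathcal{B} \mid \mathcal{A}^S) < \epsilon$, where $\mathcal{A}^S = \bigvee_{\ell \in S} T^{\ell} \mathcal{A}$. 

#9. Show that if $\mathcal{A}$ is a finite generating partition for $T$, then 
$h(T) = h(T, \mathcal{A})$. 
[Hint: Exer. 8.] 

#10. For $x, y \in \Omega$, show that $x$ and $y$ belong to the same atom of $\mathcal{A}^+$ 
if and only if $T^k(x)$ and $T^k(y)$ belong to the same atom of $\mathcal{A}$, for 
every $k < 0$. 

#11. Show: 

(a) $E_k(T, \mathcal{A}) = H(\mathcal{A}) + \sum_{m=1}^{k-1} H\bigl(\mathcal{A} \mid \bigvee_{\ell=1}^{m} T^{\ell} \mathcal{A}\bigr)$. 

(b) $H\bigl(\mathcal{A} \mid \bigvee_{\ell=1}^{k} T^{\ell} \mathcal{A}\bigr)$ is a decreasing sequence. 

(c) $h(T, \mathcal{A}) = H(\mathcal{A} \mid \mathcal{A}^+)$. 

(d) $h(T, \mathcal{A}) = H(T^{-1} \mathcal{A}^+ \mid \mathcal{A}^+)$. 

[Hint: You may assume, without proof, that 
\[ H\Bigl(\mathcal{A} \Bigm| \bigvee_{\ell=1}^{\infty} T^{\ell} \mathcal{A}\Bigr) = \lim_{k \to \infty} H\Bigl(\mathcal{A} \Bigm| \bigvee_{\ell=1}^{k} T^{\ell} \mathcal{A}\Bigr), \]
if the limit exists.] 

#12. Show $h(T, \mathcal{A}) = 0$ if and only if $\mathcal{A} \subset \mathcal{A}^+$ (up to measure zero). 
[Hint: Exer. 11.] 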

2.5. Stretching and the entropy of a translation 

(2.5.1) Remark. 

1) Note that $T_\beta$ is an isometry of $\mathbb{T}$. Hence, Cor. 2.4.2 is a particular 
case of the general fact that if $T$ is an isometry (and $\Omega$ is a 
compact metric space), then $h(T) = 0$ (see Exers. 1 and 2.3#15). 

2) Note that $T_{\mathrm{Bake}}$ is far from being an isometry of the unit square. 
Indeed: 

• the unit square can be foliated into horizontal line segments, 
and $T_{\mathrm{Bake}}$ stretches (local) distances on the leaves of this 
foliation by a factor of 2 (cf. 2.1.5), so horizontal distances 
grow exponentially fast (by a factor of $2^n$) under iterates 
of $T_{\mathrm{Bake}}$; and 

• distances in the complementary (vertical) direction are con- 
tracted exponentially fast (by a factor of $1/2^n$). 

It is not a coincidence that $\log 2$, the logarithm of the stretching 
factor, is the entropy of $T_{\mathrm{Bake}}$. 

The following theorem states a precise relationship between stretch- 
ing and entropy. (It can be stated in more general versions that apply 
to non-smooth maps, such as $T_{\mathrm{Bake}}$.) Roughly speaking, entropy is cal- 
culated by adding contributions from all of the independent directions 
that are stretched at exponential rates (and ignoring directions that 
are contracted). 

(2.5.2) Theorem. Suppose 

• $\Omega = M$ is a smooth, compact manifold, 

• vol is a volume form on $M$, 

• $T$ is a volume-preserving diffeomorphism, 

• $\tau_1, \ldots, \tau_k$ are (positive) real numbers, and 

• the tangent bundle $TM$ is a direct sum of $T$-invariant subbundles 
$\mathcal{E}_1, \ldots, \mathcal{E}_k$, such that $\|dT(\xi)\| = \tau_i \|\xi\|$, for each tangent vector 
$\xi \in \mathcal{E}_i$. 

Then 

\[ h_{\mathrm{vol}}(T) = \sum_{\tau_i > 1} (\dim \mathcal{E}_i) \log \tau_i. \]

(2.5.3) Example. If $T$ is an isometry of $M$, let $k = 1$, $\tau_1 = 1$, and $\mathcal{E}_1 = TM$. 
Then the theorem asserts that $h(T) = 0$. This establishes Rem. 2.5.1(1) 
in the special case where $\Omega$ is a smooth manifold and $T$ is a diffeomor- 
phism. 

(2.5.4) Example. For $T_{\mathrm{Bake}}$, let $k = 2$, $\tau_1 = 2$, $\tau_2 = 1/2$, 

    $\mathcal{E}_1 = \{\text{horizontal vectors}\}$ and $\mathcal{E}_2 = \{\text{vertical vectors}\}$. 

Then, if we ignore technical problems arising from the nondifferentiabil- 
ity of $T_{\mathrm{Bake}}$ and the boundary of the unit square, the theorem confirms 
our calculation that $h(T_{\mathrm{Bake}}) = \log 2$ (see 2.4.4). 

For the special case where T is the translation by an element of G, 
Thm. 2.5.2 can be rephrased in the following form. 

(2.5.5) Notation. Suppose $g$ is an element of $G$. 

• Let $G_+ = \{\, u \in G \mid \lim_{k \to -\infty} g^{-k} u g^{k} = e \,\}$. (Note that $k$ tends 
to negative infinity.) Then $G_+$ is a closed, connected subgroup 
of $G$ (see Exer. 5). 

• Let 

\[ J(g, G_+) = \bigl| \det\bigl( (\mathrm{Ad}\, g)|_{\mathfrak{g}_+} \bigr) \bigr| \]

be the Jacobian of $g$ acting by conjugation on $G_+$. 

(2.5.6) Remark. $G_+$ is called the (expanding) horospherical sub- 
group corresponding to $g$. (Although this is not reflected in the nota- 
tion, one should keep in mind that the subgroup $G_+$ depends on the 
choice of $g$.) Conjugation by $g$ expands the elements of $G_+$ because, by 
definition, conjugation by $g^{-1}$ contracts them. 

There is also a contracting horospherical subgroup $G_-$, consisting 
of the elements that are contracted by $g$. It is defined by 

\[ G_- = \{\, u \in G \mid \lim_{k \to \infty} g^{-k} u g^{k} = e \,\}, \]

the only difference being that the limit is now taken as $k$ tends to 
positive infinity. Thus, $G_-$ is the expanding horospherical subgroup 
corresponding to $g^{-1}$. 

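(2.5.7) Corollary. Let 

• $G$ be a connected Lie group, 

• $\Gamma$ be a lattice in $G$, 

• vol be a $G$-invariant volume on $\Gamma \backslash G$, and 

• $g \in G$. 

Then 

\[ h_{\mathrm{vol}}(g) = \log J(g, G_+), \]

where, abusing notation, we write $h_{\mathrm{vol}}(g)$ for the entropy of the trans- 
lation $T_g\colon \Gamma \backslash G \to \Gamma \backslash G$, defined by $T_g(x) = xg$. 

(2.5.8) Corollary. If $u \in G$ is unipotent, then $h_{\mathrm{vol}}(u) = 0$. 

(2.5.9) Corollary. If 

• $G = \mathrm{SL}(2, \mathbb{R})$, and 

• $a^s = \begin{bmatrix} e^{s} & 0 \\ 0 & e^{-s} \end{bmatrix}$, 

then $h_{\mathrm{vol}}(a^s) = 2|s|$. 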
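The value $2|s|$ in Cor. 2.5.9 can be recovered from Cor. 2.5.7 by a short computation on the Lie algebra of $\mathrm{SL}(2, \mathbb{R})$. The sketch below rests on assumptions that should be checked against the conventions of the text: $a^s$ is taken to be the diagonal matrix with entries $e^{s}$ and $e^{-s}$, conjugation is taken to be $X \mapsto g^{-1} X g$ (matching the right translation $T_g(x) = xg$), and the Jacobian on the expanding horospherical subgroup is computed as the product of the eigenvalues of absolute value greater than 1. The function names are illustrative.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][m] * b[m][j] for m in range(2)) for j in range(2)]
            for i in range(2)]

def log_expanding_jacobian(s):
    """Sum of log|eigenvalue| over the eigenvalues of X -> g^{-1} X g on
    sl(2, R) with absolute value > 1, where g = diag(e^s, e^-s) (assumed)."""
    g = [[math.exp(s), 0.0], [0.0, math.exp(-s)]]
    g_inv = [[math.exp(-s), 0.0], [0.0, math.exp(s)]]
    basis = [
        [[0.0, 1.0], [0.0, 0.0]],     # upper triangular direction
        [[0.0, 0.0], [1.0, 0.0]],     # lower triangular direction
        [[1.0, 0.0], [0.0, -1.0]],    # diagonal direction
    ]
    total = 0.0
    for x in basis:
        y = matmul(matmul(g_inv, x), g)
        # Each basis matrix is an eigenvector, so the eigenvalue can be read
        # off from any entry where the basis matrix is nonzero.
        i, j = next((i, j) for i in range(2) for j in range(2) if x[i][j] != 0)
        eigenvalue = y[i][j] / x[i][j]
        if abs(eigenvalue) > 1:
            total += math.log(abs(eigenvalue))
    return total

for s in (0.7, -1.3, 0.0):
    print(s, log_expanding_jacobian(s), 2 * abs(s))
```

The eigenvalues are $e^{2s}$, $e^{-2s}$, and 1, so exactly one direction is expanded when $s \neq 0$, and none when $s = 0$.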

Corollary 2.5.7 calculates the entropy of $g$ with respect to the nat- 
ural volume form on $\Gamma \backslash G$. The following generalization provides an es- 
timate (not always an exact calculation) for other invariant measures. 
If one accepts that entropy is determined by the amount of stretching, 
in the spirit of Thm. 2.5.2 and Cor. 2.5.7, then the first two parts of the 
following proposition are fairly obvious at an intuitive level. Namely: 

(1) The hypothesis of 2.5.11(1) implies that stretching along any di- 
rection in $W$ will contribute to the calculation of $h_\mu(g)$. This 
yields only an inequality, because there may be other directions, 
not along $W$, that also contribute to $h_\mu(g)$; that is, there may 
well be other directions that are being stretched by $g$ and belong 
to the support of $\mu$. 

(2) Roughly speaking, the hypothesis of 2.5.11(2) states that any 
direction stretched by $g$ and belonging to the support of $\mu$ must 
lie in $W$. Thus, only directions in $W$ contribute to the calculation 
of $h_\mu(g)$. This yields only an inequality, because some directions 
in $W$ may not belong to the support of $\mu$. 

(2.5.10) Notation. Suppose

• $g$ is an element of $G$, with corresponding horospherical subgroup $G_+$, and

• $W$ is a connected Lie subgroup of $G_+$ that is normalized by $g$.

Let

$J(g, W) = \bigl| \det\bigl( (\operatorname{Ad} g)|_{\mathfrak{w}} \bigr) \bigr|$

be the Jacobian of $g$ on $W$. Thus,

$\log J(g, W) = \sum_{\lambda} \log |\lambda|$,

where the sum is over all eigenvalues $\lambda$ of $(\operatorname{Ad} g)|_{\mathfrak{w}}$, counted with
multiplicity, and $\mathfrak{w}$ is the Lie algebra of $W$.

(2.5.11) Proposition. Suppose

• $g$ is an element of $G$,

• $\mu$ is a $g$-invariant probability measure on $\Gamma\backslash G$, and

• $W$ is a connected Lie subgroup of $G_+$ that is normalized by $g$.

Then:

1) If $\mu$ is $W$-invariant, then $h_\mu(g) \ge \log J(g, W)$.

2) If there is a conull, Borel subset $\Omega$ of $\Gamma\backslash G$, such that $\Omega \cap xG_- \subset xW$,
for every $x \in \Omega$, then $h_\mu(g) \le \log J(g, W)$.

3) If the hypotheses of (2) are satisfied, and equality holds in its
conclusion, then $\mu$ is $W$-invariant.

See §2.6 for a sketch of the proof. 

Although we have no need for it in these lectures, let us state the 
following vast generalization of Thm. 2.5.2 that calculates the entropy 
of any diffeomorphism.

(2.5.12) Notation. Suppose $T$ is a diffeomorphism of a smooth manifold $M$.

1) For each $x \in M$ and $\lambda \ge 0$, we let

E x (x) = (v & %M limsup M^hm < A 
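Note that

• $E_\lambda(x)$ is a vector subspace of $T_x M$, for each $x$ and $\lambda$, and

• we have $E_{\lambda_1}(x) \subset E_{\lambda_2}(x)$ if $\lambda_1 \le \lambda_2$.

2) For each $\lambda > 0$, the multiplicity of $\lambda$ at $x$ is

$m_x(\lambda) = \min_{\mu < \lambda} \bigl( \dim E_\lambda(x) - \dim E_\mu(x) \bigr)$.

By convention, $m_x(0) = \dim E_0(x)$.

3) We use

$\operatorname{Lyap}(T, x) = \{\, \lambda \ge 0 \mid m_x(\lambda) \ne 0 \,\}$

to denote the set of Lyapunov exponents of $T$ at $x$. Note that
$\sum_{\lambda \in \operatorname{Lyap}(T,x)} m_x(\lambda) = \dim M$, so $\operatorname{Lyap}(T, x)$ is a finite set, for
each $x$.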

(2.5.13) Theorem (Pesin's Entropy Formula). Suppose

• $\Omega = M$ is a smooth, compact manifold,

• vol is a volume form on $M$, and

• $T$ is a volume-preserving diffeomorphism.

Then

$h_{\mathrm{vol}}(T) = \int_M \Bigl( \sum_{\lambda \in \operatorname{Lyap}(T,x)} m_x(\lambda)\, \lambda \Bigr) \, d\mathrm{vol}(x)$.
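
In the simplest situation the integrand in Pesin's formula is constant. For a linear automorphism of the 2-torus, the derivative is the same matrix at every point, so the entropy with respect to Lebesgue measure is the sum of $\log|\lambda|$ over the eigenvalues $\lambda$ of absolute value greater than 1 (a standard fact). The Python sketch below is only an illustration under that assumption; the example matrix and the function names are hypothetical choices, not taken from the text.

```python
import cmath
import math

def eigenvalues_2x2(m):
    """Eigenvalues of a 2x2 matrix via the characteristic polynomial."""
    (a, b), (c, d) = m
    trace = a + d
    det = a * d - b * c
    disc = cmath.sqrt(trace * trace - 4 * det)
    return ((trace + disc) / 2, (trace - disc) / 2)

def entropy_toral_automorphism(m):
    """Sum of log|eigenvalue| over the eigenvalues with |eigenvalue| > 1."""
    return sum(math.log(abs(ev)) for ev in eigenvalues_2x2(m) if abs(ev) > 1)

# A hyperbolic matrix in SL(2,Z), used here only as an assumed example.
cat_map = ((2, 1), (1, 1))
print(entropy_toral_automorphism(cat_map))
print(math.log((3 + math.sqrt(5)) / 2))   # closed form for comparison
```

The two printed numbers should agree, since the expanding eigenvalue of this matrix is $(3 + \sqrt{5})/2$.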

Exercises for §2.5. 

#1. Suppose

• $T$ is an isometry of a compact metric space $\Omega$,

• $\mu$ is a $T$-invariant probability measure on $\Omega$, and

• $\{T^{-k}x\}_{k=1}^{\infty}$ is dense in the support of $\mu$, for a.e. $x \in \Omega$.

Use Thm. 2.4.8 (and Rem. 2.4.5) to show that $h_\mu(T) = 0$.
[Hint: Choose a point of density $x_0$ for $\mu$, and let $\mathcal{S}$ be a countable
partition of $\Omega$, such that $H(\mathcal{S}) < \infty$ and $\lim_{x \to x_0} \operatorname{diam}(\mathcal{S}(x)) = 0$,
where $\mathcal{S}(x)$ denotes the atom of $\mathcal{S}$ that contains $x$. Show, for a conull
subset of $\Omega$, that each atom of $\mathcal{S}^+$ is a single point.]

#2. Let $G = \mathrm{SL}(2,\mathbb{R})$, and define $u^t$ and $a^s$ as usual (see 1.1.5). Show
that if $s > 0$, then $\{u^t\}$ is the (expanding) horospherical subgroup
corresponding to $a^s$.

#3. For each $g \in G$, show that the corresponding horospherical subgroup
$G_+$ is indeed a subgroup of $G$.

#4. Given $g \in G$, let

$\mathfrak{g}_+ = \{\, u \in \mathfrak{g} \mid \lim_{k \to -\infty} u (\operatorname{Ad}_G g)^k = 0 \,\}$. (2.5.14)

(a) Show that $\mathfrak{g}_+$ is a Lie subalgebra of $\mathfrak{g}$.

(b) Show that if $\operatorname{Ad}_G g$ is diagonalizable over $\mathbb{R}$, then $\mathfrak{g}_+$ is the
sum of the eigenspaces corresponding to eigenvalues of $\operatorname{Ad}_G g$
whose absolute value is (strictly) greater than 1.

#5. Given $g \in G$, show that the corresponding horospherical subgroup
$G_+$ is the connected subgroup of $G$ whose Lie algebra is $\mathfrak{g}_+$.
[Hint: For $u \in G_+$, there is some $k \in \mathbb{Z}$ and some $v \in \mathfrak{g}$, such that
$\exp v = g^{-k} u g^{k}$.]

#6. Derive Cor. 2.5.7 from Thm. 2.5.2, under the additional assumptions that:

(a) $\Gamma\backslash G$ is compact, and

(b) $\operatorname{Ad}_G g$ is diagonalizable (over $\mathbb{C}$).

#7. Derive Cor. 2.5.8 from Cor. 2.5.7. 

#8. Derive Cor. 2.5.9 from Cor. 2.5.7. 

#9. Give a direct proof of Cor. 2.5.8. 
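[Hint: Fix

• a small set $\Omega$ of positive measure,

• $n \colon \Omega \to \mathbb{Z}^+$ with $x u^{n(x)} \in \Omega$ for a.e. $x \in \Omega$ and $\int_\Omega n \, d\mu < \infty$,

• $\lambda > 1$, such that $d(xu^1, yu^1) < \lambda\, d(x, y)$ for $x, y \in \Gamma\backslash G$,

• a partition $\mathcal{S}$ of $\Omega$, such that $\operatorname{diam}(\mathcal{S}(x)) \le \epsilon \lambda^{-n(x)}$ (see Exer. 10).

Use the argument of Exer. 1.]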

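#10. Suppose

• $\Omega$ is a precompact subset of a manifold $M$,

• $\mu$ is a probability measure on $\Omega$, and

• $\rho \colon \Omega \to \mathbb{R}^+$ satisfies $\int_\Omega \rho \, d\mu < \infty$.

Show there is a countable partition $\mathcal{S}$ of $\Omega$, such that

(a) $\mathcal{S}$ has finite entropy, and

(b) for a.e. $x \in \Omega$, we have $\operatorname{diam} \mathcal{S}(x) \le e^{-\rho(x)}$, where $\mathcal{S}(x)$
denotes the atom of $\mathcal{S}$ containing $x$.

[Hint: For each $n$, there is a partition $\mathcal{S}_n$ of $\Omega$ into sets of diameter less
than $e^{-(n+1)}$, such that $\#\mathcal{S}_n \le C e^{n \dim M}$. Let $\mathcal{S}$ be the partition into
the sets of the form $B_n \cap R_n$, where $B_n \in \mathcal{S}_n$ and $R_n = \rho^{-1}[n, n+1)$.
Then $H(\mathcal{S}) \le H(\mathcal{R}) + \sum_{n=0}^{\infty} \mu(R_n) \log \#\mathcal{S}_n < \infty$, where $\mathcal{R} = \{R_n\}$.]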

#11. Derive Cor. 2.5.7 from Prop. 2.5.11. 

#12. Use Prop. 2.5.11 to prove 

(a) Lem. 1.7.4, and 

(b) Lem. 1.8.4.

#13. Derive Thm. 2.5.2 from Thm. 2.5.13.

2.6. Proof of the entropy estimate 

For simplicity, we prove only the special case where g is a diagonal 
matrix in G = SL(2,R). The same method applies in general, if ideas 
from the solution of Exer. 2.5#9 are added. 

(1.7.4') Proposition. Let $G = \mathrm{SL}(2,\mathbb{R})$, and suppose $\mu$ is an $a^s$-invariant
probability measure on $\Gamma\backslash G$.

1) If $\mu$ is $\{u^t\}$-invariant, then $h_\mu(a^s) = 2|s|$.

2) We have $h_\mu(a^s) \le 2|s|$.

3) If $h_\mu(a^s) = 2|s|$ (and $s \ne 0$), then $\mu$ is $\{u^t\}$-invariant.

(2.6.1) Notation.

• Let $U = \{u^t\}$.

• Let $v^r$ be the opposite unipotent one-parameter subgroup (see 1.7.2).

• Let $a = a^s$, where $s > 0$ is sufficiently large that $e^{-s} < 1/10$, say.
Note that

$\lim_{k \to -\infty} a^{-k} u^t a^{k} = e$ and $\lim_{k \to \infty} a^{-k} v^r a^{k} = e$. (2.6.2)

• Let $x_0$ be a point in the support of $\mu$.

• Choose some small $\epsilon > 0$.

• Let

  ◦ $U_\epsilon = \{\, u^t \mid -\epsilon < t < \epsilon \,\}$, and

  ◦ $D$ be a small 2-disk through $x_0$ that is transverse to the $U$-orbits,

so $DU_\epsilon$ is a neighborhood of $x_0$ that is naturally homeomorphic
to $D \times U_\epsilon$.

• For any subset $A$ of $DU_\epsilon$, and any $x \in D$, the intersection $A \cap xU_\epsilon$
is called a plaque of $A$.

(2.6.3) Lemma. There is an open neighborhood $A$ of $x_0$ in $DU_\epsilon$, such
that, for any plaque $F$ of $A$, and any $k \in \mathbb{Z}^+$,

if $F \cap Aa^{k} \ne \emptyset$, then $F \subset Aa^{k}$. (2.6.4)

Proof. We may restate (2.6.4) to say:

if $Fa^{-k} \cap A \ne \emptyset$, then $Fa^{-k} \subset A$.

Let $A_0$ be any very small neighborhood of $x_0$. If $Fa^{-k}$ intersects $A_0$,
then we need to add it to $A$. Thus, we need to add

$A_1 = \bigcup \{\, Fa^{-k} \mid F \text{ is a plaque of } A_0,\ Fa^{-k} \cap A_0 \ne \emptyset,\ k > 0 \,\}$.

This does not complete the proof, because it may be the case that,
for some plaque $F$ of $A_0$, a translate $Fa^{-k}$ intersects $A_1$, but does not
intersect $A_0$. Thus, we need to add more plaques to $A$, and continue
inductively:

• Define $A_{n+1} = \bigcup \{\, Fa^{-k} \mid F \text{ is a plaque of } A_0,\ Fa^{-k} \cap A_n \ne \emptyset,\ k > n \,\}$.

• Let $A = \bigcup_{n=0}^{\infty} A_n$.

It is crucial to note that we may restrict to $k > n$ in the definition
of $A_{n+1}$ (see Exer. 1). Because conjugation by $a^{-k}$ contracts distances
along $U$ exponentially (see 2.6.2), this implies that $\operatorname{diam} A$ is bounded
by a geometric series that converges rapidly. By keeping $\operatorname{diam} A_0$ sufficiently
small, we guarantee that $A \subset DU_\epsilon$. □

(2.6.5) Notation.

• Let

$\mathcal{S} = \{\, A,\ (\Gamma\backslash G) \smallsetminus A \,\}$,

where $A$ was constructed in Lem. 2.6.3. (Technically, this is not
quite correct; the proof of Lem. 2.6.7 shows that we should
take a similar, but more complicated, partition of $\Gamma\backslash G$.)

• Let $\mathcal{S}^+ = \bigvee_{k=1}^{\infty} \mathcal{S}a^{k}$ (cf. 2.4.7).

• Let $\mathcal{S}^+(x)$ be the atom of $\mathcal{S}^+$ containing $x$, for each $x \in \Gamma\backslash G$.

(2.6.6) Assumption. Let us assume that the measure $\mu$ is ergodic
for $a$. (The general case can be obtained from this by considering the
ergodic decomposition of $\mu$.)

(2.6.7) Lemma. The partition $\mathcal{S}^+$ is subordinate to $U$. That is, for
a.e. $x \in \Gamma\backslash G$,

1) $\mathcal{S}^+(x) \subset xU$, and

2) more precisely, $\mathcal{S}^+(x)$ is a relatively compact, open neighborhood
of $x$ (with respect to the orbit topology of $xU$).

Proof. For a.e. $x \in A$, we will show that $\mathcal{S}^+(x)$ is simply the plaque of $A$
that contains $x$. Thus, (1) and (2) hold for a.e. $x \in A$. By ergodicity,
it immediately follows that the conditions hold for a.e. $x \in \Gamma\backslash G$.

If $F$ is any plaque of $A$, then (2.6.4) implies, for each $k > 0$, that
$F$ is contained in a single atom of $\mathcal{S}a^{k}$. Therefore, $F$ is contained in a
single atom of $\mathcal{S}^+$. The problem is to show that $\mathcal{S}^+(x)$ contains only a
single plaque.

Let $V_\epsilon = \{v^r\}_{r=-\epsilon}^{\epsilon}$, and pretend, for the moment, that $x_0 U_\epsilon V_\epsilon$ is
a neighborhood of $x_0$. (Thus, we are ignoring $\{a^s\}$, and pretending
that $G$ is 2-dimensional.) For $k > 0$, we know that conjugation
by $a^{k}$ contracts $\{v^r\}$, so $Aa^{k}$ is very thin in the $\{v^r\}$-direction (and
correspondingly long in the $\{u^t\}$-direction). In the limit, we conclude
that the atoms of $\mathcal{S}^+$ are infinitely thin in the $\{v^r\}$-direction. The union
of any two plaques has a nonzero length in the $\{v^r\}$-direction, so we
conclude that an atom of $\mathcal{S}^+$ contains only one plaque, as desired.

To complete the proof, we need to deal with the $\{a^s\}$-direction.
(Unfortunately, this direction is not contracted by $a^{k}$, so the argument
of the preceding paragraph does not apply.) To do this, we alter the
definition of $\mathcal{S}$.

• Let $\mathcal{B}$ be a countable partition of $D$, such that $H(\mathcal{B}) < \infty$ and
$\lim_{x \to x_0} \operatorname{diam}(\mathcal{B}(x)) = 0$.

• Let $\widehat{\mathcal{S}}$ be the corresponding partition of $A$:

$\widehat{\mathcal{S}} = \{\, (BU_\epsilon) \cap A \mid B \in \mathcal{B} \,\}$.

• Let $\mathcal{S} = \widehat{\mathcal{S}} \cup \{ (\Gamma\backslash G) \smallsetminus A \}$.

Then $\mathcal{S}$ is a countable partition of $\Gamma\backslash G$ with $H(\mathcal{S}) < \infty$, so $h_\mu(a) =
H(\mathcal{S}^+ a^{-1} \mid \mathcal{S}^+)$ (see 2.4.5 and 2.4.9). Ergodicity implies that $xa^{k}$ is
close to $x_0$ for some values of $k$. From the choice of $\mathcal{S}$, this implies that
$\mathcal{S}^+(x)$ has small length in the $\{a^s\}$-direction. In the limit, $\mathcal{S}^+(x)$ must
be infinitely thin in the $\{a^s\}$-direction. □

Proof of 1.7.4'(1). We wish to show $H(\mathcal{S}^+ a^{-1} \mid \mathcal{S}^+) = 2s$ (see 2.4.9).
For any $x \in \Gamma\backslash G$, let

• $\mu_x$ be the conditional measure induced by $\mu$ on $\mathcal{S}^+(x)$, and

• $\lambda$ be the Haar measure (that is, the Lebesgue measure) on $xU$.

Because $\mathcal{S}^+(x) \subset xU$, and $\mu$ is $U$-invariant, we know that

$\mu_x$ is the restriction of $\lambda$ to $\mathcal{S}^+(x)$ (up to a scalar multiple). (2.6.8)

Now $\mathcal{S}^+ \subset \mathcal{S}^+ a^{-1}$, so $\mathcal{S}^+ a^{-1}$ induces a partition $\mathcal{S}_x$ of $\mathcal{S}^+(x)$. By definition,
we have

$H(\mathcal{S}^+ a^{-1} \mid \mathcal{S}^+) = - \int_{\Gamma\backslash G} \log f \, d\mu$,

where

$f(x) = \mu_x(\mathcal{S}_x(x)) = \frac{\lambda(\mathcal{S}_x(x))}{\lambda(\mathcal{S}^+(x))}$. (2.6.9)

Note that, because translating by $a$ transforms $\mathcal{S}^+ a^{-1}$ to $\mathcal{S}^+$, we have

$x \{\, u \in U \mid xu \in \mathcal{S}_x(x) \,\}\, a = xa \{\, u \in U \mid xau \in \mathcal{S}^+(xa) \,\}$.

Conjugating by $a$ expands $\lambda$ by a factor of $e^{2s}$, so this implies

$f(x) = \frac{\lambda(\mathcal{S}_x(x))}{\lambda(\mathcal{S}^+(x))} = e^{-2s}\, \frac{\lambda(\mathcal{S}^+(xa))}{\lambda(\mathcal{S}^+(x))}$.

Because $0 \le f(x) \le 1$ and $e^{2s}$ is a constant, we conclude that

$\bigl( \log \lambda(\mathcal{S}^+(xa)) - \log \lambda(\mathcal{S}^+(x)) \bigr)^+ \in L^1(\Gamma\backslash G, \mu)$,

so Lem. 2.6.10 below implies

$- \int_{\Gamma\backslash G} \log f \, d\mu = \log e^{2s} = 2s$,

as desired. □

The following observation is obvious (from the invariance of $\mu$) if
$\psi \in L^1(\Gamma\backslash G, \mu)$. The general case is proved in Exer. 3.1#8.

(2.6.10) Lemma. Suppose

• $\mu$ is an $a$-invariant probability measure on $\Gamma\backslash G$, and

• $\psi$ is a real-valued, measurable function on $\Gamma\backslash G$,

such that

$\bigl( \psi(xa) - \psi(x) \bigr)^+ \in L^1(\Gamma\backslash G, \mu)$,

where $(\alpha)^+ = \max(\alpha, 0)$. Then

$\int_{\Gamma\backslash G} \bigl( \psi(xa) - \psi(x) \bigr) \, d\mu(x) = 0$.
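
For the integrable case mentioned just before the lemma, the computation is a single line. The following is a sketch under the extra assumption that $\psi \in L^1(\Gamma\backslash G, \mu)$, so that both integrals on the right are finite and the difference may be split:

```latex
\int_{\Gamma\backslash G} \bigl( \psi(xa) - \psi(x) \bigr) \, d\mu(x)
  = \int_{\Gamma\backslash G} \psi(xa) \, d\mu(x) - \int_{\Gamma\backslash G} \psi(x) \, d\mu(x)
  = \int_{\Gamma\backslash G} \psi \, d\mu - \int_{\Gamma\backslash G} \psi \, d\mu
  = 0 .
```

The second equality is where the $a$-invariance of $\mu$ is used. Without integrability of $\psi$ the splitting in the first step is not available, which is why the general case needs a separate argument.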

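Proof of 1.7.4'(2). This is similar to the proof of (1). Let $\lambda_x =
\lambda/\lambda(\mathcal{S}^+(x))$ be the normalization of $\lambda$ to a probability measure on
$\mathcal{S}^+(x)$. (In the proof of (1), we had $\lambda_x = \mu_x$ (see 2.6.8), so we did
not bother to define $\lambda_x$.) Also, define

$f_\mu(x) = \mu_x(\mathcal{S}_x(x))$ and $f_\lambda(x) = \lambda_x(\mathcal{S}_x(x))$.

(In the proof of (1), we had $f_\mu = f_\lambda$ (see 2.6.9); we simply called the
function $f$.)

We have

$h_\mu(a) = H(\mathcal{S}^+ a^{-1} \mid \mathcal{S}^+) = - \int_{\Gamma\backslash G} \log f_\mu \, d\mu$,

and the proof of (1) shows that

$- \int_{\Gamma\backslash G} \log f_\lambda \, d\mu = 2s$,

so it suffices to show

$\int_{\Gamma\backslash G} \log f_\lambda \, d\mu \le \int_{\Gamma\backslash G} \log f_\mu \, d\mu$.

Thus, we need only show, for a.e. $x \in \Gamma\backslash G$, that

$\int_{\mathcal{S}^+(x)} \log f_\lambda \, d\mu_x \le \int_{\mathcal{S}^+(x)} \log f_\mu \, d\mu_x$. (2.6.11)

Write $\mathcal{S}_x = \{A_1, \ldots, A_n\}$. For $y \in A_i$, we have

$f_\lambda(y) = \lambda_x(A_i)$ and $f_\mu(y) = \mu_x(A_i)$,

so

$\int_{\mathcal{S}^+(x)} (\log f_\lambda - \log f_\mu) \, d\mu_x = \sum_{i=1}^{n} \log \frac{\lambda_x(A_i)}{\mu_x(A_i)} \, \mu_x(A_i)$.

Because

$\sum_{i=1}^{n} \mu_x(A_i) = 1$,

the concavity of the log function implies

$\sum_{i=1}^{n} \log \frac{\lambda_x(A_i)}{\mu_x(A_i)} \, \mu_x(A_i) \le \log \sum_{i=1}^{n} \frac{\lambda_x(A_i)}{\mu_x(A_i)} \, \mu_x(A_i)$ (2.6.12)
$= \log 1$
$= 0$.

This completes the proof. □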
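
The inequality (2.6.12) is an instance of Jensen's inequality for the concave function log, and it is easy to test numerically. The Python sketch below is only an illustration: the probability vectors stand in for $(\lambda_x(A_i))$ and $(\mu_x(A_i))$, and the function names are hypothetical, not taken from the text.

```python
import math
import random

def weighted_log_ratio(lam, mu):
    """Return sum_i mu_i * log(lam_i / mu_i), skipping atoms with mu_i = 0."""
    return sum(m * math.log(l / m) for l, m in zip(lam, mu) if m > 0)

def random_distribution(n, rng):
    """A random probability vector with strictly positive entries."""
    weights = [rng.random() + 1e-9 for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]

rng = random.Random(0)
worst = max(
    weighted_log_ratio(random_distribution(5, rng), random_distribution(5, rng))
    for _ in range(1000)
)
# By Jensen's inequality this value cannot exceed 0 (up to rounding).
print("largest value over random trials:", worst)

uniform = [0.2] * 5
print("equal distributions:", weighted_log_ratio(uniform, uniform))
```

Equality occurs when the two vectors coincide, which is the case exploited in the proof of part (3).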



Proof of 1.7.4'(3). Let $\mu_U$ be the conditional measure induced by $\mu$
on the orbit $xU$. To show that $\mu$ is $U$-invariant, we wish to show that
$\mu_U$ is equal to $\lambda$ (up to a scalar multiple).

We must have equality in the proof of 1.7.4'(2). Specifically, for a.e.
$x \in \Gamma\backslash G$, we must have equality in (2.6.11), so we must have equality
in (2.6.12). Because the log function is strictly concave, we conclude
that

$\frac{\lambda_x(A_i)}{\mu_x(A_i)} = \frac{\lambda_x(A_j)}{\mu_x(A_j)}$

for all $i, j$. Since

$\sum_{i=1}^{n} \lambda_x(A_i) = \lambda_x(\mathcal{S}^+(x)) = 1 = \mu_x(\mathcal{S}^+(x)) = \sum_{i=1}^{n} \mu_x(A_i)$,

we conclude that $\lambda_x(A_i) = \mu_x(A_i)$. This means that $\mu_U(A) = \lambda(A)$ for
all atoms of $\mathcal{S}_x$. By applying the same argument with $a^{k}$ in the place
of $a$ (for all $k \in \mathbb{Z}^+$), we conclude that $\mu_U(A) = \lambda(A)$ for all $A$ in a
collection that generates the $\sigma$-algebra of all measurable sets in $xU$.
Therefore $\mu_U = \lambda$. □






Exercise for §2.6. 

#1. In the proof of Lem. 2.6.3, show that if

• $F$ is a plaque of $A_0$,

• $1 \le k \le n$, and

• $Fa^{-k} \cap A_n \ne \emptyset$,

then $Fa^{-k} \cap (A_0 \cup \cdots \cup A_{n-1}) \ne \emptyset$.
[Hint: Induction on $k$.]

Notes 

The entropy of dynamical systems is a standard topic that is dis- 
cussed in many textbooks, including [6, §4.3— §4.5] and [19, Chap. 4]. 

§2.1. Irrational rotations $T_\beta$ and Bernoulli shifts $T_{\mathrm{Bern}}$ are standard
examples. The Baker's Transformation $T_{\mathrm{Bake}}$ is less common, but
it appears in [2, p. 22], for example.

§2.2. Proposition 2.2.5 appears in standard texts, including [19, 
Cor. 4.14.1]. 

§2.3. This material is standard (including the properties of en- 
tropy developed in the exercises). 

Our treatment of the entropy of a partition is based on [7]. The 
elementary argument mentioned at the end of Rem. 2.3.4(6) appears 
in [7, pp. 9-13]. 

It is said that the entropy of a dynamical system was first defined 
by A. N. Kolmogorov [8, 9], and that much of the basic theory is due 
to Ya. Sinai [17, 18]. 

Topological entropy was defined by R. L. Adler, A. G. Konheim,
and M. H. McAndrew [1]. Our discussion in Rem. 2.3.8 is taken from
[5].

L. W. Goodwyn [4] proved the inequality $h_\mu(T) \le h_{\mathrm{top}}(T)$. A
simple proof of a stronger result appears in [15].

§2.4. This material is standard. 
Theorem 2.4.1 is due to Ya. Sinai. 

§2.5. Corollary 2.5.7 was proved by R. Bowen [3] when $\Gamma\backslash G$ is
compact. The general case (when $\Gamma\backslash G$ has finite volume) was apparently
already known to dynamicists in the Soviet Union. For example,
it follows from the argument that proves [13, (8.35), p. 68]. 

A complete proof of the crucial entropy estimate (2.5.11) appears 
in [14, Thm. 9.7]. It is based on ideas from [11]. 

Pesin's Formula (2.5.13) was proved in [16]. Another proof appears 
in [12]. 

Exercise 2.5#10 is [12, Lem. 2].

§2.6. This section is based on [14, §9].
Lemma 2.6.10 is proved in [10, Prop. 2.2]. 



References 




[1] R. L. Adler, A. G. Konheim, and M. H. McAndrew: Topological
entropy. Trans. Amer. Math. Soc. 114 (1965), 309-319.
MR 30 #5291 

[2] B. Bekka and M. Mayer: Ergodic Theory and Topological Dynamics
of Group Actions on Homogeneous Spaces. London Math. Soc. Lec. 
Notes #269. Cambridge U. Press, Cambridge, 2000. ISBN 0-521- 
66030-0, MR 2002c:37002 

[3] R. Bowen: Entropy for group endomorphisms and homogeneous 
spaces. Trans. Amer. Math. Soc. 153 (1971), 401-414. MR 43 #469 

[4] L. W. Goodwyn: Topological entropy bounds measure-theoretic 
entropy. Proc. Amer. Math. Soc. 23 (1969), 679-688. MR 40 #299 

[5] B. Hasselblatt and A. Katok: Principal structures, in: B. Hassel- 
blatt and A. Katok, eds., Handbook of Dynamical Systems, Vol. 
1A. North-Holland, Amsterdam, 2002, pp. 1-203. ISBN 0-444- 
82669-6, MR 2004c:37001 

[6] A. Katok and B. Hasselblatt: Introduction to the Modern The- 
ory of Dynamical Systems. Encyclopedia of Mathematics and its 
Applications, 54. Cambridge Univ. Press, Cambridge, 1995. ISBN 
0-521-34187-6, MR 96c:58055 

[7] A. I. Khinchin: Mathematical Foundations of Information Theory, 
Dover, New York, 1957. ISBN 0-486-60434-9, MR 19,1148f 

[8] A. N. Kolmogorov: A new metric invariant of transient dynamical 
systems and automorphisms in Lebesgue spaces (Russian). Dokl. 
Akad. Nauk SSSR (N.S.) 119 (1958), 861-864. MR 21 #2035a 

[9] A. N. Kolmogorov: Entropy per unit time as a metric invariant 
of automorphisms (Russian). Dokl. Akad. Nauk SSSR 124 (1959), 
754-755. MR 21 #2035b 

[10] F. Ledrappier and J.-M. Strelcyn: A proof of the estimation from 
below in Pesin's entropy formula. Ergodic Th. Dyn. Sys. 2 (1982), 
203-219. MR 85f:58070 

[11] F. Ledrappier and L.-S. Young: The metric entropy of diffeomor- 
phisms. I. Characterization of measures satisfying Pesin's entropy 
formula. Ann. of Math. 122 (1985), no. 3, 509-539. MR 87i:58101a 

[12] R. Mañé: A proof of Pesin's formula. Ergodic Th. Dyn. Sys.
1 (1981), no. 1, 95-102 (errata 3 (1983), no. 1, 159-160). 
MR 83b:58042, MR 85f:58064 

[13] G. A. Margulis: On Some Aspects of the Theory of Anosov Sys- 
tems. Springer, Berlin, 2004. ISBN 3-540-40121-0, MR 2035655 






[14] G. A. Margulis and G. M. Tomanov: Invariant measures for actions 
of unipotent groups over local fields on homogeneous spaces, In- 
vent. Math. 116 (1994), 347-392. (Announced in C. R. Acad. Sci. 
Paris Ser. I Math. 315 (1992), no. 12, 1221-1226.) MR 94f:22016, 
MR 95k:22013 

[15] M. Misiurewicz: A short proof of the variational principle for a
$\mathbb{Z}_+^N$ action on a compact space. Internat. Conf. on Dynam. Sys.
in Math. Physics (Rennes, 1975). Astérisque 40 (1976), 147-157.
MR 56 #3250

[16] Ja. B. Pesin: Characteristic Ljapunov exponents, and smooth ergodic
theory (Russian). Uspehi Mat. Nauk 32 (1977), no. 4 (196),
55-112, 287. Engl. transl. in Russian Math. Surveys 32 (1977), no.
4, 55-114. MR 57 #6667

[17] Ja. Sinai: On the concept of entropy for a dynamic system (Rus- 
sian). Dokl. Akad. Nauk SSSR 124 (1959), 768-771. MR 21 #2036a 

[18] Ja. Sinai: Flows with finite entropy (Russian). Dokl. Akad. Nauk 
SSSR 125 (1959), 1200-1202. MR 21 #2036b 

[19] P. Walters: An Introduction to Ergodic Theory, Springer, New 
York, 1982. ISBN 0-387-90599-5, MR 84e:28017 



CHAPTER 3 



Facts from Ergodic Theory 

This chapter simply gathers some necessary background results, 
mostly without proof. 

3.1. Pointwise Ergodic Theorem 

In the proof of Ratner's Theorem (and in many other situations), 
one wants to know that the orbits of a flow are uniformly distributed. 
It is rarely the case that every orbit is uniformly distributed (that is
what it means to say the flow is uniquely ergodic), but the Pointwise
Ergodic Theorem (3.1.3) shows that if the flow is "ergodic," a much 
weaker condition, then almost every orbit is uniformly distributed. 
(See the exercises for a proof.) 

(3.1.1) Definition. A measure-preserving flow $\varphi_t$ on a probability
space $(X, \mu)$ is ergodic if, for each $\varphi_t$-invariant subset $A$ of $X$, we
have either $\mu(A) = 0$ or $\mu(A) = 1$.

(3.1.2) Example. For $G = \mathrm{SL}(2,\mathbb{R})$ and $\Gamma = \mathrm{SL}(2,\mathbb{Z})$, the horocycle
flow $\eta_t$ and the geodesic flow $\gamma_t$ are ergodic on $\Gamma\backslash G$ (with respect to the
Haar measure on $\Gamma\backslash G$) (see 3.2.7 and 3.2.4). These are special cases of
the Moore Ergodicity Theorem (3.2.6), which implies that most flows of
one-parameter subgroups on $\Gamma\backslash \mathrm{SL}(n,\mathbb{R})$ are ergodic, but the ergodicity
of $\gamma_t$ can easily be proved from scratch (see Exer. 3.2#3).

(3.1.3) Theorem (Pointwise Ergodic Theorem). Suppose

• $\mu$ is a probability measure on a locally compact, separable metric
space $X$,

• $\varphi_t$ is an ergodic, measure-preserving flow on $X$, and

• $f \in L^1(X, \mu)$.

Copyright © 2003-2005 Dave Witte Morris. All rights reserved. 
Permission to make copies of these lecture notes for educational or scientific use, 
including multiple copies for classroom or seminar teaching, is granted (without 
fee), provided that any fees charged for the copies are only sufficient to recover 
the reasonable copying costs, and that all copies include the title page and this 
copyright notice. Specific written permission of the author is required to reproduce 
or distribute this book (in whole or in part) for profit or commercial advantage.

Then

$\frac{1}{T} \int_0^T f(\varphi_t(x)) \, dt \to \int_X f \, d\mu$ as $T \to \infty$ (3.1.4)

for a.e. $x \in X$.

(3.1.5) Definition. A point $x \in X$ is generic for $\mu$ if (3.1.4) holds for
every uniformly continuous, bounded function on $X$. In other words, a
point is generic for $\mu$ if its orbit is uniformly distributed in $X$.

(3.1.6) Corollary. If $\varphi_t$ is ergodic, then almost every point of $X$ is
generic for $\mu$.

The converse of this corollary is true (see Exer. 2). 
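
The statement that time averages converge to the space average can be observed numerically in a discrete-time analogue (compare Exer. 5). The Python sketch below is only an illustration and makes several assumptions not in the text: it uses the rotation $x \mapsto x + \alpha \pmod 1$ of the circle with an irrational $\alpha$ (a standard ergodic example with respect to Lebesgue measure), the test function $f(x) = \cos^2(2\pi x)$, whose integral over $[0,1]$ is $1/2$, and hypothetical function names.

```python
import math

def birkhoff_average(f, x0, alpha, n):
    """Average of f over the first n points of the orbit of x0 under x -> x + alpha (mod 1)."""
    total = 0.0
    x = x0
    for _ in range(n):
        total += f(x)
        x = (x + alpha) % 1.0
    return total / n

def f(x):
    return math.cos(2 * math.pi * x) ** 2

alpha = (math.sqrt(5) - 1) / 2   # an irrational rotation number (assumed example)
space_average = 0.5              # integral of cos^2(2*pi*x) over [0, 1]

for n in (10, 100, 10_000):
    print(n, birkhoff_average(f, 0.1, alpha, n), space_average)
```

As $n$ grows, the first printed number in each row should approach the second. For this particular system every starting point works, not just almost every one, because irrational rotations are uniquely ergodic.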
Exercises for §3.1. 
#1. Prove Cor. 3.1.6 from Thm. 3.1.3. 

#2. Let $\varphi_t$ be a measure-preserving flow on $(X, \mu)$. Show that if $\varphi_t$ is
not ergodic, then almost no point of $X$ is generic for $\mu$.

#3. Let

• $\varphi_t$ be an ergodic measure-preserving flow on $(X, \mu)$, and

• $\Omega$ be a non-null subset of $X$.

Show, for a.e. $x \in X$, that

$\{\, t \in \mathbb{R}^+ \mid \varphi_t(x) \in \Omega \,\}$

is unbounded.

[Hint: Use the Pointwise Ergodic Theorem.]

#4. Suppose

• $\phi \colon X \to X$ is a measurable bijection of $X$,

• $\mu$ is a $\phi$-invariant probability measure on $X$,

• $f \in L^1(X, \mu)$, and

• $S_n(x) = f(x) + f(\phi(x)) + \cdots + f(\phi^{n-1}(x))$.

Prove the Maximal Ergodic Theorem: for every $\alpha \in \mathbb{R}$, if we
let

$E = \{\, x \in X \mid \sup_n S_n(x)/n > \alpha \,\}$,

then $\int_E f \, d\mu \ge \alpha\, \mu(E)$.

[Hint: Assume $\alpha = 0$. Let $S_n^+(x) = \max_{0 \le k \le n} S_k(x)$, and $E_n = \{\, x \mid
S_n^+(x) > 0 \,\}$, so $E = \bigcup_n E_n$. For $x \in E_n$, we have $f(x) \ge S_n^+(x) -
S_n^+(\phi(x))$, so $\int_{E_n} f \, d\mu \ge 0$.]

#5. Prove the Pointwise Ergodic Theorem for $\phi$, $\mu$, $f$, and $S_n$ as
in Exer. 4. That is, if $\phi$ is ergodic, show, for a.e. $x$, that

$\lim_{n \to \infty} \frac{S_n(x)}{n} = \int_X f \, d\mu$.

[Hint: If $\{\, x \mid \limsup S_n(x)/n > \alpha \,\}$ is not null, then it must be conull,
by ergodicity. So the Maximal Ergodic Theorem (Exer. 4) implies
$\int_X f \, d\mu \ge \alpha$.]

#6. For $\phi$, $\mu$, $f$, and $S_n$ as in Exer. 4, show there is a function $f^* \in
L^1(X, \mu)$, such that:

(a) for a.e. $x$, we have $\lim_{n \to \infty} S_n(x)/n = f^*(x)$,

(b) for a.e. $x$, we have $f^*(\phi(x)) = f^*(x)$, and

(c) $\int_X f^* \, d\mu = \int_X f \, d\mu$.

(This generalizes Exer. 5, because we do not assume $\phi$ is ergodic.)
[Hint: For $\alpha < \beta$, replacing $X$ with the $\phi$-invariant set

$X_\alpha^\beta = \{\, x \mid \liminf S_n(x)/n < \alpha < \beta < \limsup S_n(x)/n \,\}$

and applying Exer. 4 yields $\int_{X_\alpha^\beta} f \, d\mu \le \alpha\, \mu(X_\alpha^\beta)$ and $\int_{X_\alpha^\beta} f \, d\mu \ge
\beta\, \mu(X_\alpha^\beta)$.]

#7. Prove Thm. 3.1.3.

[Hint: Assume $f \ge 0$ and apply Exer. 5 to the function $\overline{f}(x) =
\int_0^1 f(\varphi_t(x)) \, dt$.]

#8. Prove Lem. 2.6.10.

[Hint: The Pointwise Ergodic Theorem (6) remains valid if $f = f_+ +
f_-$, with $f_+ \ge 0$, $f_- \le 0$, and $f_+ \in L^1(X, \mu)$, but the limit $f^*$ can be
$-\infty$ on a set of positive measure. Applying this to $f(x) = \psi(xa) - \psi(x)$,
we conclude that

$\lim_{n \to \infty} \frac{\psi(xa^n) - \psi(x)}{n}$

exists a.e. (but may be $-\infty$). Furthermore, $\int_{\Gamma\backslash G} f^* \, d\mu = \int_{\Gamma\backslash G} f \, d\mu$.
Since $\psi(xa^n)/n \to 0$ in measure, there is a sequence $n_k \to \infty$, such
that $\psi(xa^{n_k})/n_k \to 0$ a.e. So $f^*(x) = 0$ a.e.]

#9. Suppose

• $X$ is a compact metric space,

• $\phi \colon X \to X$ is a homeomorphism, and

• $\mu$ is a $\phi$-invariant probability measure on $X$.

Show $\phi$ is uniquely ergodic if and only if, for every continuous
function $f$ on $X$, there is a constant $C$, depending on $f$, such
that

$\lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} f(\phi^k(x)) = C$,

uniformly over $x \in X$.

3.2. Mautner Phenomenon

We prove that the geodesic flow is ergodic (see Cor. 3.2.4). The
same methods apply to many other flows on $\Gamma\backslash G$.

(3.2.1) Definition. Suppose $\varphi_t$ is a flow on a measure space $(X, \mu)$,
and $f$ is a measurable function on $X$.

• $f$ is essentially invariant if, for each $t \in \mathbb{R}$, we have $f(\varphi_t(x)) =
f(x)$ for a.e. $x \in X$.

• $f$ is essentially constant if $f(x) = f(y)$ for a.e. $x, y \in X$.

(3.2.2) Remark. It is obvious that any essentially constant function
is essentially invariant. The converse holds if and only if $\varphi_t$ is ergodic
(see Exer. 1).

See Exers. 2 and 3 for the proof of the following proposition and
the first corollary.

(3.2.3) Proposition (Mautner Phenomenon). Suppose

• $\mu$ is a probability measure on $\Gamma\backslash G$,

• $f \in L^2(\Gamma\backslash G, \mu)$, and

• $u^t$ and $a^s$ are one-parameter subgroups of $G$,

such that

• $a^{-s} u^t a^s = u^{e^s t}$,

• $\mu$ is invariant under both $u^t$ and $a^s$, and

• $f$ is essentially $a^s$-invariant.

Then $f$ is essentially $u^t$-invariant.

(3.2.4) Corollary. The geodesic flow $\gamma_t$ is ergodic on $\Gamma\backslash\mathrm{SL}(2,\mathbb{R})$.

The following corollary is obtained by combining Prop. 3.2.3 with
Rem. 3.2.2.

(3.2.5) Corollary. Suppose

• $\mu$ is a probability measure on $\Gamma\backslash G$, and

• $u^t$ and $a^s$ are one-parameter subgroups of $G$,

such that

• $a^{-s} u^t a^s = u^{e^s t}$,

• $\mu$ is invariant under both $u^t$ and $a^s$, and

• $\mu$ is ergodic for $u^t$.

Then $\mu$ is ergodic for $a^s$.

The following result shows that flows on $\Gamma\backslash G$ are often ergodic. It
is a vast generalization of the fact that the horocycle flow $\eta_t$ and the
geodesic flow $\gamma_t$ are ergodic on $\Gamma\backslash\mathrm{SL}(2,\mathbb{R})$.

(3.2.6) Theorem (Moore Ergodicity Theorem). Suppose

• $G$ is a connected, simple Lie group with finite center,

• $\Gamma$ is a lattice in $G$, and

• $g^t$ is a one-parameter subgroup of $G$, such that its closure $\overline{\{g^t\}}$ is
not compact.

Then $g^t$ is ergodic on $\Gamma\backslash G$ (w.r.t. the Haar measure on $\Gamma\backslash G$).

(3.2.7) Corollary. The horocycle flow $\eta_t$ is ergodic on $\Gamma\backslash\mathrm{SL}(2,\mathbb{R})$.

(3.2.8) Remark. The conclusion of Thm. 3.2.6 can be strengthened:
not only is $g^t$ ergodic on $\Gamma\backslash G$, but it is mixing. That is, if

• $A$ and $B$ are any two measurable subsets of $\Gamma\backslash G$, and

• $\mu$ is the $G$-invariant probability measure on $\Gamma\backslash G$,

then $\mu\bigl( (Ag^t) \cap B \bigr) \to \mu(A)\, \mu(B)$ as $t \to \infty$.

The following theorem is a restatement of this remark in terms of 
functions. 

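(3.2.9) Theorem. If

• $G$ is a connected, simple Lie group with finite center,

• $\Gamma$ is a lattice in $G$,

• $\mu$ is the $G$-invariant probability measure on $\Gamma\backslash G$, and

• $g^t$ is a one-parameter subgroup of $G$, such that $\overline{\{g^t\}}$ is not compact,

then

$\lim_{t \to \infty} \int_{\Gamma\backslash G} \phi(xg^t)\, \psi(x) \, d\mu = \int_{\Gamma\backslash G} \phi \, d\mu \int_{\Gamma\backslash G} \psi \, d\mu$

for every $\phi, \psi \in L^2(\Gamma\backslash G, \mu)$.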

(3.2.10) Remark. For an elementary (but very instructive) case of the
following proof, assume $G = \mathrm{SL}(2,\mathbb{R})$, and $g^t = a^t$ is diagonal. Then
only Case 1 is needed, and we have



(Note that $\langle U, V \rangle = G$; that is, $U$ and $V$, taken together, generate $G$.)

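Proof. (Requires some Lie theory and Functional Analysis.)

• Let $\mathcal{H} = 1^\perp$ be the (closed) subspace of $L^2(\Gamma\backslash G, \mu)$ consisting
of the functions of integral 0. Because the desired conclusion is
obvious if $\phi$ or $\psi$ is constant, we may assume $\phi, \psi \in \mathcal{H}$.

• For each $g \in G$, define the unitary operator $g^\rho$ on $\mathcal{H}$ by

$(\phi g^\rho)(x) = \phi(xg^{-1})$.

• Define $\langle \phi \mid \psi \rangle = \int_{\Gamma\backslash G} \phi \psi \, d\mu$.

• Instead of taking the limit only along a one-parameter subgroup $g^t$,
we allow a more general limit along any sequence $g_j$,
such that $g_j \to \infty$ in $G$; that is, $\{g_j\}$ has no convergent subsequences.

Case 1. Assume $\{g_j\}$ is contained in a hyperbolic torus $A$. By passing
to a subsequence, we may assume $g_j^\rho$ converges weakly, to some
operator $E$; that is,

$\langle \phi g_j^\rho \mid \psi \rangle \to \langle \phi E \mid \psi \rangle$ for every $\phi, \psi \in \mathcal{H}$.

Let

$U = \{\, u \in G \mid g_j u g_j^{-1} \to e \,\}$

and

$V = \{\, v \in G \mid g_j^{-1} v g_j \to e \,\}$.

For $v \in V$, we have

$\langle \phi v^\rho E \mid \psi \rangle = \lim_{j \to \infty} \langle \phi v^\rho g_j^\rho \mid \psi \rangle$
$= \lim_{j \to \infty} \langle \phi g_j^\rho (g_j^{-1} v g_j)^\rho \mid \psi \rangle$
$= \lim_{j \to \infty} \langle \phi g_j^\rho \mid \psi \rangle$
$= \langle \phi E \mid \psi \rangle$,

so $v^\rho E = E$. Therefore, $E$ annihilates the image of $v^\rho - I$, for every
$v \in V$. Now, these images span a dense subspace of the orthogonal
complement $(\mathcal{H}^V)^\perp$ of the subspace $\mathcal{H}^V$ of elements of $\mathcal{H}$ that are fixed
by every element of $V$. Hence, $E$ annihilates $(\mathcal{H}^V)^\perp$.

Using $*$ to denote the adjoint, we have

$\langle \phi E^* \mid \psi \rangle = \langle \phi \mid \psi E \rangle = \lim_{j \to \infty} \langle \phi \mid \psi g_j^\rho \rangle = \lim_{j \to \infty} \langle \phi (g_j^{-1})^\rho \mid \psi \rangle$,

so the same argument, with $E^*$ in the place of $E$ and $g_j^{-1}$ in the place
of $g_j$, shows that $E^*$ annihilates $(\mathcal{H}^U)^\perp$.

Because $g_j^\rho$ is unitary, it is normal (that is, commutes with its adjoint);
thus, the limit $E$ is also normal: we have $E^* E = E E^*$. Therefore

$\|\phi E\|^2 = \langle \phi E \mid \phi E \rangle = \langle \phi (E E^*) \mid \phi \rangle$
$= \langle \phi (E^* E) \mid \phi \rangle = \langle \phi E^* \mid \phi E^* \rangle = \|\phi E^*\|^2$,

so $\ker E = \ker E^*$. Hence

$\ker E = \ker E + \ker E^* \supset (\mathcal{H}^V)^\perp + (\mathcal{H}^U)^\perp$
$= (\mathcal{H}^V \cap \mathcal{H}^U)^\perp = (\mathcal{H}^{\langle U, V \rangle})^\perp$.

By passing to a subsequence (so $\{g_j\}$ is contained in a single Weyl
chamber), we may assume $\langle U, V \rangle = G$. Then $\mathcal{H}^{\langle U, V \rangle} = \mathcal{H}^G = 0$, so
$\ker E \supset 0^\perp = \mathcal{H}$. Hence, for all $\phi, \psi \in \mathcal{H}$, we have

$\lim \langle \phi g_j^\rho \mid \psi \rangle = \langle \phi E \mid \psi \rangle = \langle 0 \mid \psi \rangle = 0$,

as desired.

Case 2. The general case. From the Cartan Decomposition $G = KAK$,
we may write $g_j = c'_j a_j c_j$, with $c'_j, c_j \in K$ and $a_j \in A$. Because $K$ is
compact, we may assume, by passing to a subsequence, that $\{c'_j\}$ and
$\{c_j\}$ converge: say, $c'_j \to c'$ and $c_j \to c$. Then

$\lim_{j \to \infty} \langle \phi g_j^\rho \mid \psi \rangle = \lim_{j \to \infty} \langle \phi (c'_j a_j c_j)^\rho \mid \psi \rangle$
$= \lim_{j \to \infty} \langle \phi (c'_j)^\rho a_j^\rho \mid \psi (c_j^{-1})^\rho \rangle$
$= \lim_{j \to \infty} \langle \phi (c')^\rho a_j^\rho \mid \psi (c^{-1})^\rho \rangle$
$= 0$,

by Case 1. □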



(3.2.11) Remark.

1) If

• $G$ and $\{g^t\}$ are as in Thm. 3.2.9,

• $\Gamma$ is any discrete subgroup of $G$, that is not a lattice, and

• $\mu$ is the (infinite) $G$-invariant measure on $\Gamma\backslash G$,

then the above proof (with $\mathcal{H} = L^2(\Gamma\backslash G, \mu)$) shows that

$\lim_{t \to \infty} \int_{\Gamma\backslash G} \phi(xg^t)\, \psi(x) \, d\mu = 0$,

for every $\phi, \psi \in L^2(\Gamma\backslash G, \mu)$.

2) Furthermore, the discrete subgroup $\Gamma$ can be replaced with any
closed subgroup $H$ of $G$, such that $H\backslash G$ has a $G$-invariant measure
$\mu$ that is finite on compact sets.

• If the measure of $H\backslash G$ is finite, then the conclusion is as in
Thm. 3.2.9.

• If the measure is infinite, then the conclusion is as in (1).

Exercises for §3.2. 

#1. Suppose $\varphi_t$ is a flow on a measure space $(X, \mu)$. Show that $\varphi_t$
is ergodic if and only if every essentially invariant measurable
function is essentially constant.

#2. Prove Prop. 3.2.3 (without quoting other theorems of the text).
[Hint: We have

$f(xu) = f(xua^s) = f\bigl( (xa^s)(a^{-s} u a^s) \bigr) \approx f(xa^s) = f(x)$,

because $a^{-s} u a^s \approx e$.]

#3. Derive Cor. 3.2.4 from Prop. 3.2.3.

[Hint: If $f$ is essentially $\gamma_t$-invariant, then the Mautner Phenomenon
implies that it is also essentially $u^t$-invariant and essentially
$v^r$-invariant.]

#4. Show that any mixing flow on $\Gamma\backslash G$ is ergodic.

[Hint: Let $A = B$ be a $g^t$-invariant subset of $\Gamma\backslash G$.]

#5. Derive Rem. 3.2.8 from Thm. 3.2.9.

#6. Derive Thm. 3.2.9 from Rem. 3.2.8.

[Hint: Any $L^2$ function can be approximated by step functions.]

#7. Suppose

• $G$ and $\{g^t\}$ are as in Thm. 3.2.9,

• $\Gamma$ is any discrete subgroup of $G$,

• $\mu$ is the $G$-invariant measure on $\Gamma\backslash G$,

• $\phi \in L^p(\Gamma\backslash G, \mu)$, for some $p < \infty$, and

• $\phi$ is essentially $g^t$-invariant.

Show that $\phi$ is essentially $G$-invariant.

[Hint: Some power of $\phi$ is in $L^2$. Use Thm. 3.2.9 and Rem. 3.2.11.]
#8. Let

• Γ be a lattice in G = SL(2,R), and

• μ be a probability measure on Γ\G.

Show that if μ is invariant under both a^s and u^t, then μ is the
Haar measure.

[Hint: Let λ be the Haar measure on Γ\G, let U_ε = { u^t | 0 ≤ t ≤ ε },
and define A_ε and V_ε to be similar small intervals in {a^s} and {v^r},
respectively. If f is continuous with compact support, then

\[ \lim_{s \to \infty} \int_{y U_\epsilon A_\epsilon V_\epsilon} f(x a^s) \, d\lambda(x)
   = \lambda(y U_\epsilon A_\epsilon V_\epsilon) \int_{\Gamma \backslash G} f \, d\lambda, \]

for all y ∈ Γ\G (see Thm. 3.2.9). Because f is uniformly continuous,
we see that

\[ \int_{y U_\epsilon A_\epsilon V_\epsilon} f(x a^s) \, d\lambda(x)
   = \int_{y U_\epsilon A_\epsilon V_\epsilon a^s} f \, d\lambda \]

is approximately λ(yU_εA_εV_ε) times the average of f over a long
segment of the u^t-orbit of ya^s.

By choosing y and {s_k} such that ya^{s_k} → y and applying the Pointwise
Ergodic Theorem, conclude that λ = μ.]
3.3. Ergodic decomposition

Every measure-preserving flow can be decomposed into a union of 
ergodic flows. 

(3.3.1) Example. Let

• v = (α, 1, 0) ∈ R^3, for some irrational α,

• φ_t be the corresponding flow on T^3 = R^3/Z^3, and

• μ be the Lebesgue measure on T^3.

Then φ_t is not ergodic, because the third coordinate is constant along
every orbit, so sets of the form T^2 × A are invariant.

However, the flow decomposes into a union of ergodic flows: for
each z ∈ T, let

• T_z = T^2 × {z}, and

• μ_z be the Lebesgue measure on the torus T_z.
Then:

1) T^3 is the disjoint union ⋃_z T_z,

2) the restriction of φ_t to each subtorus T_z is ergodic (with respect
to μ_z), and

3) the measure μ is the integral of the measures μ_z (by Fubini's
Theorem).
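
The failure of ergodicity in this example can be checked numerically. The following is only a rough sketch: the choice α = √2, the starting points, the test functions, and the helper name `time_average` are all assumptions made for illustration. Because v = (α, 1, 0) has third coordinate 0, the flow never changes the third coordinate, so the time average of a function of that coordinate depends on the starting point, whereas the time average of a function of a moving coordinate approaches its integral over the invariant 2-torus.

```python
import math

def time_average(f, x0, v, T=1000.0, steps=100_000):
    """Approximate (1/T) * integral_0^T f(x0 + t*v mod 1) dt by a midpoint Riemann sum."""
    dt = T / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt
        x = tuple((x0[i] + t * v[i]) % 1.0 for i in range(3))
        total += f(x)
    return total / steps

alpha = math.sqrt(2)      # an irrational number (assumed choice)
v = (alpha, 1.0, 0.0)     # direction of the flow; the third coordinate never moves

def f_fixed(x):
    # depends only on the coordinate that the flow leaves fixed
    return math.cos(2 * math.pi * x[2])

def f_moving(x):
    # depends on a coordinate that the flow moves
    return math.cos(2 * math.pi * x[0])

avg_a = time_average(f_fixed, (0.1, 0.2, 0.0), v)    # orbit inside the torus z = 0
avg_b = time_average(f_fixed, (0.1, 0.2, 0.25), v)   # orbit inside the torus z = 1/4
avg_c = time_average(f_moving, (0.1, 0.2, 0.25), v)  # close to the integral over the torus, which is 0
```

The two averages `avg_a` and `avg_b` differ, although an ergodic flow would give the same time average (the integral over T^3) for almost every starting point.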

The following proposition shows that every measure μ can be
decomposed into ergodic measures. Each ergodic measure μ_z is called an
ergodic component of μ.

(3.3.2) Proposition. If μ is any φ_t-invariant probability measure
on X, then there exist

• a measure ν on a space Z, and

• a (measurable) family {μ_z}_{z∈Z} of ergodic measures on X,

such that μ = ∫_Z μ_z dν; that is,

\[ \int_X f \, d\mu = \int_Z \int_X f \, d\mu_z \, d\nu(z), \]

for every f ∈ L^1(X, μ).

Proof (requires some Functional Analysis). Let M be the set of
φ_t-invariant probability measures on X. This is a weak*-compact, convex
subset of the dual of a certain Banach space, the continuous functions
on X that vanish at ∞. So Choquet's Theorem asserts that any point
in M is a convex combination of extreme points of M. That is, if we let
Z be the set of extreme points, then there is a probability measure ν
on Z, such that μ = ∫_Z z dν(z). Simply letting μ_z = z, and noting
that the extreme points of M are precisely the ergodic measures (see
Exer. 1) yields the desired conclusion. □
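
A finite analogue may help to make the statement concrete. The sketch below replaces the flow by a single permutation of a finite set (an assumption made purely for illustration; it is not the setting of the proposition). In that toy setting the ergodic invariant measures are the uniform measures on the cycles, and every invariant probability measure is a convex combination of them. The function names and the sample permutation are hypothetical.

```python
from fractions import Fraction

def cycles(perm):
    """Return the cycles of a permutation given as a dict x -> perm[x]."""
    seen, result = set(), []
    for start in perm:
        if start in seen:
            continue
        cycle, x = [], start
        while x not in seen:
            seen.add(x)
            cycle.append(x)
            x = perm[x]
        result.append(cycle)
    return result

def ergodic_decomposition(mu, perm):
    """Write a perm-invariant measure mu as a list of (weight, uniform measure on a cycle).

    Assumes mu is invariant, i.e. constant on each cycle of perm.
    """
    parts = []
    for cycle in cycles(perm):
        weight = sum(mu[x] for x in cycle)
        if weight > 0:
            uniform = {x: Fraction(1, len(cycle)) for x in cycle}
            parts.append((weight, uniform))
    return parts

# A permutation of {0,...,4} with one 3-cycle and one 2-cycle.
perm = {0: 1, 1: 2, 2: 0, 3: 4, 4: 3}
# The uniform measure is invariant, but it is not ergodic.
mu = {x: Fraction(1, 5) for x in perm}
parts = ergodic_decomposition(mu, perm)
```

Here the weights play the role of the measure ν on Z, and the uniform measures on the cycles play the role of the ergodic measures μ_z.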
The above proposition yields a decomposition of the measure μ,
but, unlike Eg. 3.3.1, it does not provide a decomposition of the
space X. However, any two ergodic measures must be mutually
singular (see Exer. 2), so a little more work yields the following geometric
version of the ergodic decomposition. This often allows one to reduce
a general question to the case where the flow is ergodic.

(3.3.3) Theorem (Ergodic decomposition). If μ is a φ_t-invariant
probability measure on X, then there exist

• a (measurable) family {μ_z}_{z∈Z} of ergodic measures on X,

• a measure ν on Z, and

• a measurable function ψ: X → Z,
such that

1) μ = ∫_Z μ_z dν, and

2) μ_z is supported on ψ^{-1}(z), for a.e. z ∈ Z.

Sketch of Proof. Let F ⊂ L^1(X, μ) be the collection of {0,1}-valued
functions that are essentially φ_t-invariant. Because the Banach space
L^1(X, μ) is separable, we may choose a countable dense subset F_0 =
{ψ_n} of F. This defines a Borel function ψ: X → {0,1}^∞. (By changing
each of the functions in F_0 on a set of measure 0, we may assume ψ
is φ_t-invariant, not merely essentially invariant.) Let Z = {0,1}^∞ and
ν = ψ_*μ, the push-forward of μ by ψ. Proposition 3.3.4 below yields a
(measurable) family {μ_z}_{z∈Z} of probability measures on X, such that
(1) and (2) hold.

All that remains is to show that μ_z is ergodic for a.e. z ∈ Z. Thus,
let us suppose that

Z_bad = { z ∈ Z | μ_z is not ergodic }

is not a null set. For each z ∈ Z_bad, there is a {0,1}-valued function
f_z ∈ L^1(X, μ_z) that is essentially φ_t-invariant, but not essentially
constant. The functions f_z can be chosen to depend measurably on z (this
is a consequence of the Von Neumann Selection Theorem); thus, there is a
single measurable function f on X, such that

• f = f_z a.e.[μ_z] for z ∈ Z_bad, and

• f = 0 on ψ^{-1}(Z \ Z_bad).

Because each f_z is essentially φ_t-invariant, we know that f is essentially
φ_t-invariant; thus, f ∈ F. On the other hand, f is not essentially
constant on the fibers of ψ, so f is not in the closure of F_0. This is a
contradiction. □

The above proof relies on the following very useful generalization 
of Fubini's Theorem. 

(3.3.4) Proposition. Let 
• X and Y be complete, separable metric spaces,

• μ and ν be probability measures on X and Y, respectively, and

• ψ: X → Y be a measure-preserving Borel map.
Then there is a Borel map λ: Y → Prob(X), such that

1) μ = ∫_Y λ_y dν(y), and

2) λ_y(ψ^{-1}(y)) = 1, for all y ∈ Y.
Furthermore, λ is unique (up to measure zero).
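
For finite spaces, the map λ of Prop. 3.3.4 is just the family of conditional probabilities on the fibers of ψ. The following sketch illustrates this special case only; the function name `disintegrate` and the sample measure are assumptions introduced for the illustration. Exact rational arithmetic is used so that the two conclusions of the proposition can be checked with equality.

```python
from fractions import Fraction

def disintegrate(mu, psi):
    """Return (nu, lam) for a finitely supported probability measure mu.

    nu is the push-forward of mu by psi, and lam[y] is the conditional
    probability measure carried by the fiber psi^{-1}(y).
    """
    nu = {}
    for x, mass in mu.items():
        y = psi(x)
        nu[y] = nu.get(y, Fraction(0)) + mass
    lam = {y: {} for y in nu}
    for x, mass in mu.items():
        y = psi(x)
        if nu[y] > 0:
            lam[y][x] = mass / nu[y]
    return nu, lam

# A probability measure on X = {0,1} x {0,1}, and the projection to the first factor.
mu = {(0, 0): Fraction(1, 2), (0, 1): Fraction(1, 4),
      (1, 0): Fraction(1, 8), (1, 1): Fraction(1, 8)}

def psi(x):
    return x[0]

nu, lam = disintegrate(mu, psi)
```

When μ is a product measure, each λ_y is the same measure on the second factor, and conclusion (1) reduces to Fubini's Theorem.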

Exercises for §3.3.

#1. In the notation of the proof of Prop. 3.3.2, show that a point μ
of M is ergodic if and only if it is an extreme point of M. (A
point μ of M is an extreme point if it is not a convex
combination of two other points of M: that is, if there do not exist
μ_1, μ_2 ∈ M, and t ∈ (0,1), such that μ = tμ_1 + (1 − t)μ_2 and
μ_1 ≠ μ_2.)

#2. Suppose μ_1 and μ_2 are ergodic, φ_t-invariant probability measures
on X. Show that if μ_1 ≠ μ_2, then there exist subsets Ω_1 and Ω_2
of X, such that, for i, j ∈ {1, 2}, we have

\[ \mu_i(\Omega_j) = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{if } i \neq j. \end{cases} \]

#3. Let

• X and X' be complete, separable metric spaces,

• μ and μ' be probability measures on X and X', respectively,

• φ_t and φ'_t be ergodic, measure-preserving flows on X and X',
respectively,

• ψ: X → X' be a measure-preserving, equivariant Borel
map, and

• Ω be a conull subset of X, such that ψ^{-1}(y) ∩ Ω is countable,
for a.e. y ∈ X'.

Show there is a conull subset Ω' of X, such that ψ^{-1}(y) ∩ Ω' is
finite, for a.e. y ∈ X'.

[Hint: The function f(x) = λ_{ψ(x)}({x}) is essentially φ_t-invariant, so
it must be essentially constant. A probability measure with all atoms
of the same weight must have only finitely many atoms.]

3.4. Averaging sets

The proof of Ratner's Theorem uses a version of the Pointwise
Ergodic Theorem that applies to (unipotent) groups that are not just
one-dimensional. The classical version (3.1.3) asserts that averaging a
function over larger and larger intervals of almost any orbit will
converge to the integral of the function. Note that the average is over
intervals, not over arbitrary large subsets of the orbit. In the setting of
higher-dimensional groups, we will average over "averaging sets."

(3.4.1) Definition. Suppose

• U is a connected, unipotent subgroup of G,

• a is a hyperbolic element of G that normalizes U,

• a^{-n} u a^n → e as n → −∞ (note that this is −∞, not ∞!), and

• E is a ball in U (or, more generally, E is any bounded, non-null,
Borel subset of U).

Then:

1) we say that a is an expanding automorphism of U,

2) for each n ≥ 0, we call E_n = a^{-n} E a^n an averaging set, and

3) we call {E_n}_{n=0}^∞ an averaging sequence.

(3.4.2) Remark.

1) By assumption, conjugating by a^n contracts U when n < 0.
Conversely, conjugating by a^n expands U when n > 0. Thus, E_1,
E_2, … are larger and larger subsets of U. (This justifies calling a
an "expanding" automorphism.)

2) Typically, one takes E to be a nice set (perhaps a ball) that
contains e, with E ⊂ a^{-1}Ea. In this case, {E_n}_{n=0}^∞ is an increasing
Følner sequence (see Exer. 1), but, for technical reasons, we will
employ a more general choice of E at one point in our argument
(namely, in 5.8.7, the proof of Prop. 5.2.4').
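
The expansion can be seen in the simplest case. The sketch below assumes G = SL(2,R), U = { u(t) } the lower-triangular unipotent group, and a = diag(e, 1/e); these concrete choices and the helper names are assumptions for illustration. A direct matrix computation gives a^{-n} u(t) a^n = u(e^{2n} t), which tends to e as n → −∞, so that if E corresponds to the interval [−1, 1] of parameters, then E_n corresponds to [−e^{2n}, e^{2n}].

```python
import math

def matmul(x, y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def a_pow(n):
    """a^n, where a = diag(e, 1/e) (an assumed choice of hyperbolic element)."""
    return [[math.exp(n), 0.0], [0.0, math.exp(-n)]]

def u(t):
    """The unipotent matrix with lower-left entry t."""
    return [[1.0, 0.0], [t, 1.0]]

def conjugate(n, t):
    """Return a^{-n} u(t) a^{n}."""
    return matmul(matmul(a_pow(-n), u(t)), a_pow(n))

def averaging_interval(n, radius=1.0):
    """Parameter interval of E_n = a^{-n} E a^{n}, when E is {u(t) : |t| <= radius}."""
    r = radius * math.exp(2 * n)
    return (-r, r)

expanded = conjugate(2, 0.5)      # lower-left entry is 0.5 * e^4
contracted = conjugate(-3, 1.0)   # lower-left entry is e^{-6}, close to 0
intervals = [averaging_interval(n) for n in range(4)]
```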

(3.4.3) Theorem (Pointwise Ergodic Theorem). If

• U is a connected, unipotent subgroup of G,

• a is an expanding automorphism of U,

• ν_U is the Haar measure on U,

• μ is an ergodic U-invariant probability measure on Γ\G, and

• f is a continuous function on Γ\G with compact support,

then there exists a U-invariant subset Ω of Γ\G with μ(Ω) = 1, such
that

\[ \frac{1}{\nu_U(E_n)} \int_{E_n} f(xu) \, d\nu_U(u) \to \int_{\Gamma \backslash G} f(y) \, d\mu(y)
   \quad \text{as } n \to \infty, \]

for every x ∈ Ω and every averaging sequence {E_n} in U.

To overcome some technical difficulties, we will also use the
following uniform approximate version (see Exer. 3). It is "uniform," because
the same number N works for all points x ∈ Ω_ε, and the same set Ω_ε
works for all functions f.
(3.4.4) Corollary (Uniform Pointwise Ergodic Theorem). If

• U is a connected, unipotent subgroup of G,

• a is an expanding automorphism of U,

• ν_U is the Haar measure on U,

• μ is an ergodic U-invariant probability measure on Γ\G, and

• ε > 0,

then there exists a subset Ω_ε of Γ\G with μ(Ω_ε) > 1 − ε, such that for

• every continuous function f on Γ\G with compact support,

• every averaging sequence {E_n} in U, and

• every δ > 0,

there is some N ∈ N, such that

\[ \left| \frac{1}{\nu_U(E_n)} \int_{E_n} f(xu) \, d\nu_U(u) - \int_{\Gamma \backslash G} f(y) \, d\mu(y) \right| < \delta, \]

for all x ∈ Ω_ε and all n ≥ N.

(3.4.5) Remark. 

1) A Lie group G is said to be amenable if it has a Følner sequence.

2) It is known that a connected Lie group G is amenable if and 
only if there are closed, connected, normal subgroups U and R 
of G, such that 

• U is unipotent, 

• U ⊂ R,

• R/U is abelian, and 

• G/R is compact. 

3) There are examples to show that not every Følner sequence {E_n}
can be used as an averaging sequence, but it is always the case 
that some subsequence of {E n } can be used as the averaging 
sequence for a pointwise ergodic theorem. 
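
The Følner condition of Exer. 3.4#1(a) is easy to check by hand in the one-dimensional case. The sketch below assumes U = (R, +), C = [−c, c], and E_n = [0, e^{2n}] (the lengths e^{2n} are an assumed choice that mimics conjugation by an expanding automorphism); the function name is hypothetical. Since C + E_n is the interval [−c, e^{2n} + c], the symmetric difference with E_n consists of two end pieces of total length 2c, so the ratio tends to 0.

```python
import math

def folner_ratio(c, length):
    """For E = [0, length] and C = [-c, c] in the group (R, +), return
    measure((C + E) symmetric-difference E) / measure(E)."""
    # C + E = [-c, length + c] contains E, so the symmetric difference
    # consists of the two end pieces [-c, 0) and (length, length + c].
    return (2 * c) / length

# Ratios for the intervals E_n = [0, e^{2n}], n = 0, ..., 5.
ratios = [folner_ratio(1.0, math.exp(2 * n)) for n in range(6)]
```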

Exercises for §3.4. 

#1. Suppose

• U is a connected, unipotent subgroup of G,

• a is an expanding automorphism of U,

• ν_U is the Haar measure on U, and

• E is a precompact, open subset of U, such that E ⊂ a^{-1}Ea.
Show that the averaging sequence E_n is an increasing Følner
sequence; that is,

(a) for each nonempty compact subset C of U, we have
ν_U((CE_n) △ E_n) / ν_U(E_n) → 0 as n → ∞, and

(b) E_n ⊂ E_{n+1}, for each n.

#2. Show that if G is amenable, then there is an invariant probability
measure for any action of G on a compact metric space. More
precisely, suppose

• {E_n} is a Følner sequence in a Lie group G,

• X is a compact metric space, and

• G acts continuously on X.

Show there is a G-invariant probability measure on X.
[Hint: Haar measure restricts to a measure ν_n on E_n. Pushing this
to X (and normalizing) yields a probability measure μ_n on X. Any
weak*-limit of {μ_n} is G-invariant.]

#3. Derive Cor. 3.4.4 from Thm. 3.4.3. 

Notes 

A few of the many introductory books on Ergodic Theory are [7, 
8,23]. 

§3.1. This material is standard. 

The Pointwise Ergodic Theorem is due to G. D. Birkhoff [2]. There 
are now many different proofs, such as [9, 10]. (See also [1, Thm. 1.2.5, 
p. 17]). The hints for Exers. 3.1#4 and 3.1#5 are adapted from [5, 
pp. 19-24]. 

Exercise 3.1#8 is [11, Prop. 2.2]. 

A solution to Exer. 3.1#9 appears in [1, Thm. 1.3.8, p. 33].

§3.2. The Moore Ergodicity Theorem (3.2.6) was first proved by 
C. C. Moore [16]. Later, he [17] extended this to a very general version 
of the Mautner Phenomenon (3.2.3). 

Mixing is a standard topic (see, e.g., [8, 23] and [1, pp. 21-28].) 
Our proof of Thm. 3.2.9 is taken from [3]. Proofs can also be found in 
[1, Chap. 3], [13, §11.3] and [24, Chap. 2]. 

A solution to Exer. 3.2#1 appears in [1, Thm. 1.1.3, p. 3]. 

The hint to Exer. 3.2#8 is adapted from [14, Lem. 5.2, p. 31].

§3.3. This material is standard. 

A complete proof of Prop. 3.3.2 from Choquet's Theorem appears 
in [19, §12]. 

See [18, §8] for a brief history (and proof) of the ergodic decompo- 
sition (3.3.3). 

Proposition 3.3.4 appears in [20, §3]. 

Exer. 3.3#1 is solved in [1, Prop. 3.1, p. 30]. 
§3.4. For any amenable Lie group, a theorem of A. Tempelman
[21], generalized by W. R. Emerson [4], states that certain Følner
sequences can be used as averaging sequences in a pointwise ergodic
theorem. (A proof also appears in [22, Cor. 6.3.2, p. 218].) The Uniform
Pointwise Ergodic Theorem (3.4.4) is deduced from this in [15, §7.2
and §7.3].

The book of Greenleaf [6] is the classic source for information on 
amenable groups. 

The converse of Exer. 3.4#2 is true [6, Thm. 3.6.2]. Indeed, the exis- 
tence of invariant measures is often taken as the definition of amenabil- 
ity. See [24, §4.1] for a discussion of amenable groups from this point 
of view, including the characterization mentioned in Rem. 3.4.5(2). 

Remark 3.4.5(3) is a theorem of E. Lindenstrauss [12].
CHAPTER 4



Facts about Algebraic Groups 

In the theory of Lie groups, all homomorphisms (and other maps) 
are generally assumed to be C^∞ functions (see §4.9). The theory of
algebraic groups describes the conclusions that can be obtained from 
the stronger assumption that the maps are polynomial functions (or, at 
least, rational functions). Because the polynomial nature of unipotent 
flows plays such an important role in the arguments of Chapter 1 (see, 
for example, Prop. 1.5.15), it is natural to expect that a good under- 
standing of polynomials will be essential at some points in the more 
complete proof presented in Chapter 5. However, the reader may wish 
to skip over this chapter, and refer back when necessary. 

4.1. Algebraic groups 

(4.1.1) Definition.

• We use R[x_{1,1}, …, x_{ℓ,ℓ}] to denote the set of real polynomials in
the ℓ^2 variables { x_{i,j} | 1 ≤ i, j ≤ ℓ }.

• For any Q ∈ R[x_{1,1}, …, x_{ℓ,ℓ}], and any ℓ × ℓ matrix g, we use
Q(g) to denote the value obtained by substituting the matrix
entries g_{i,j} into the variables x_{i,j}. For example:

◦ If Q = x_{1,1} + x_{2,2} + ⋯ + x_{ℓ,ℓ}, then Q(g) is the trace of g.

◦ If Q = x_{1,1}x_{2,2} − x_{1,2}x_{2,1}, then Q(g) is the determinant of
the first principal 2 × 2 minor of g.

• For any subset 𝒬 of R[x_{1,1}, …, x_{ℓ,ℓ}], let

Var(𝒬) = { g ∈ SL(ℓ,R) | Q(g) = 0, ∀Q ∈ 𝒬 }.
This is the variety associated to 𝒬.

Copyright © 2003-2005 Dave Witte Morris. All rights reserved. 
Permission to make copies of these lecture notes for educational or scientific use, 
including multiple copies for classroom or seminar teaching, is granted (without 
fee), provided that any fees charged for the copies are only sufficient to recover 
the reasonable copying costs, and that all copies include the title page and this 
copyright notice. Specific written permission of the author is required to reproduce 
or distribute this book (in whole or in part) for profit or commercial advantage. 
• A subset H of SL(ℓ,R) is Zariski closed if there is a subset 𝒬
of R[x_{1,1}, …, x_{ℓ,ℓ}], such that H = Var(𝒬). (In the special case
where H is a subgroup of SL(ℓ,R), we may also say that H is a
real algebraic group or an algebraic group that is defined
over R.)
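
The definition can be tried out on small matrices. In the following sketch a polynomial in the matrix entries is modelled simply as a Python function of the matrix; that representation, the function names, and the two sample matrices (both of determinant 1) are assumptions made for illustration. The family { x_{i,j} | i ≠ j } cuts out the diagonal matrices, as in the hint to Exer. 2 below.

```python
from fractions import Fraction

def trace_poly(g):
    """Q = x_{1,1} + x_{2,2} + ... + x_{l,l}, evaluated at g."""
    return sum(g[i][i] for i in range(len(g)))

def minor_poly(g):
    """Q = x_{1,1} x_{2,2} - x_{1,2} x_{2,1}, evaluated at g."""
    return g[0][0] * g[1][1] - g[0][1] * g[1][0]

def off_diagonal_polys(l):
    """The family { x_{i,j} | i != j }, as a list of functions of g."""
    return [lambda g, i=i, j=j: g[i][j]
            for i in range(l) for j in range(l) if i != j]

def in_variety(g, polys):
    """Test whether every polynomial in the family vanishes at g.

    The caller is assumed to supply a matrix g of determinant 1.
    """
    return all(Q(g) == 0 for Q in polys)

d = [[Fraction(2), 0, 0], [0, Fraction(3), 0], [0, 0, Fraction(1, 6)]]  # diagonal
w = [[1, 0, 0], [5, 1, 0], [0, 0, 1]]                                   # not diagonal

diag_family = off_diagonal_polys(3)
```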

(4.1.2) Example. Each of the following is a real algebraic group (see
Exer. 2):

1) SL(ℓ,R).

2) The group

\[ \begin{bmatrix} * & & 0 \\ & \ddots & \\ 0 & & * \end{bmatrix} \subset \mathrm{SL}(\ell,\mathbb{R}) \]

of diagonal matrices in SL(ℓ,R).

3) The group

\[ \begin{bmatrix} 1 & & 0 \\ & \ddots & \\ * & & 1 \end{bmatrix} \subset \mathrm{SL}(\ell,\mathbb{R}) \]

of lower-triangular matrices with 1's on the diagonal.

4) The group

\[ \begin{bmatrix} * & & 0 \\ & \ddots & \\ * & & * \end{bmatrix} \subset \mathrm{SL}(\ell,\mathbb{R}) \]

of lower-triangular matrices in SL(ℓ,R).

5) The copy of SL(n,R) in the top left corner of SL(ℓ,R) (if n < ℓ).

6) The stabilizer

Stab_{SL(ℓ,R)}(v) = { g ∈ SL(ℓ,R) | vg = v }

of any vector v ∈ R^ℓ.

7) The stabilizer

Stab_{SL(ℓ,R)}(V) = { g ∈ SL(ℓ,R) | ∀v ∈ V, vg ∈ V }

of any linear subspace V of R^ℓ.

8) The special orthogonal group SO(Q) of a quadratic form Q on R^ℓ
(see Defn. 1.2.1).

It is important to realize that most closed subsets of SL(ℓ,R) are
not Zariski closed. In particular, the following important theorem tells
us that an infinite, discrete subset can never be Zariski closed. (It is a
generalization of the fact that any nontrivial polynomial function on R
has only finitely many zeroes.) We omit the proof.

(4.1.3) Theorem (Whitney). Any Zariski closed subset of SL(ℓ,R)
has only finitely many components (with respect to the usual topology
of SL(ℓ,R) as a Lie group).

(4.1.4) Example. From Thm. 4.1.3, we know that the discrete group
SL(ℓ,Z) is not Zariski closed. In fact, we will see that SL(ℓ,Z) is
not contained in any Zariski closed, proper subgroup of SL(ℓ,R) (see
Exer. 4.7#1).

(4.1.5) Remark. Zariski closed sets need not be submanifolds of
SL(ℓ,R). This follows from Exer. 4, for example, because the union
of two submanifolds that intersect is usually not a submanifold: the
intersection is a singularity.

Exercise 10 defines the dimension of any Zariski closed set Z.
Although we do not prove this, it can be shown that (if Z is nonempty),
there is a unique smallest Zariski closed subset S of Z, such that

• dim S < dim Z,

• Z \ S is a C^∞ submanifold of SL(ℓ,R), and

• dim Z (as defined below) is equal to the dimension of Z \ S as a
manifold.

The set S is the singular set of Z. From the uniqueness of S, it follows
that any Zariski closed subgroup of SL(ℓ,R) is a C^∞ submanifold of
SL(ℓ,R) (see Exer. 5).

Exercises for §4.1. 

#1. Show that every Zariski closed subset of SL(ℓ,R) is closed (in the
usual topology of SL(ℓ,R) as a Lie group).

#2. Verify that each of the groups in Eg. 4.1.2 is Zariski closed.
[Hint: (1) Let 𝒬 = ∅. (2) Let 𝒬 = { x_{i,j} | i ≠ j }. (6) Let 𝒬 =
{ v_1 x_{1,j} + ⋯ + v_ℓ x_{ℓ,j} − v_j | 1 ≤ j ≤ ℓ }, where v = (v_1, …, v_ℓ).]

#3. Show that if Z is a Zariski closed subset of SL(ℓ,R), and g ∈
SL(ℓ,R), then Zg is Zariski closed.

#4. Suppose Z_1 and Z_2 are Zariski closed subsets of SL(ℓ,R). Show
that the union Z_1 ∪ Z_2 is Zariski closed.

#5. Show that if G is a Zariski closed subgroup of SL(ℓ,R), then G
is a C^∞ submanifold of SL(ℓ,R) (so G is a Lie group).
[Hint: Uniqueness of the singular set S (see 4.1.5) implies Sg = S for
all g ∈ G, so S = ∅.]

The remaining exercises present some (more technical) informa- 
tion about Zariski closed sets, including the notion of dimension. 
#6. For any subset Z of SL(ℓ,R), let S(Z) be the collection of
polynomials that vanish on Z; that is,

S(Z) = { Q ∈ R[x_{1,1}, …, x_{ℓ,ℓ}] | ∀z ∈ Z, Q(z) = 0 }.

(a) Show Z is Zariski closed if and only if Z = Var(S(Z)).

(b) Show that S(Z) is an ideal; that is,

(i) 0 ∈ S(Z),

(ii) for all Q_1, Q_2 ∈ S(Z), we have Q_1 + Q_2 ∈ S(Z), and

(iii) for all Q_1 ∈ S(Z) and Q_2 ∈ R[x_{1,1}, …, x_{ℓ,ℓ}], we have
Q_1 Q_2 ∈ S(Z).

#7. Recall that a ring R is Noetherian if it has the ascending chain
condition on ideals; this means that if I_1 ⊂ I_2 ⊂ ⋯ is any
increasing chain of ideals, then we have I_n = I_{n+1} = ⋯ for some n.

(a) Show that a commutative ring R is Noetherian if and only if
all of its ideals are finitely generated; that is, for each ideal I
of R, there is a finite subset F of I, such that I is the
smallest ideal of R that contains F.

(b) Show that R[x_{1,1}, …, x_{ℓ,ℓ}] is Noetherian.

(c) Show that if Z is a Zariski closed subset of SL(ℓ,R), then
there is a finite subset 𝒬 of R[x_{1,1}, …, x_{ℓ,ℓ}], such that Z =
Var(𝒬).

(d) Prove that the collection of Zariski closed subsets of SL(ℓ,R)
has the descending chain condition: if Z_1 ⊃ Z_2 ⊃ ⋯ is a
decreasing chain of Zariski closed sets, then we have Z_n =
Z_{n+1} = ⋯ for some n.

[Hint: (7b) Show that if R is Noetherian, then the polynomial ring
R[x] is Noetherian: If I is an ideal in R[x], let

I_n = { r ∈ R | ∃Q ∈ R[x], r x^n + Q ∈ I and deg Q < n }.

Then I_1 ⊂ I_2 ⊂ ⋯ is an increasing chain of ideals.]

#8. A Zariski closed subset of SL(ℓ,R) is irreducible if it is not the
union of two Zariski closed proper subsets.
Let Z be a Zariski closed subset of SL(ℓ,R).

(a) Show that Z is the union of finitely many irreducible Zariski
closed subsets.

(b) An irreducible component of Z is an irreducible Zariski
closed subset of Z that is not properly contained in any
irreducible Zariski closed subset of Z.

(i) Show that Z is the union of its irreducible components.

(ii) Show that Z has only finitely many irreducible
components.

[Hint: (8a) Proof by contradiction: use the descending chain condition.
(8b) Use (8a).]

#9. Suppose G is a Zariski closed subgroup of SL(ℓ,R).

(a) Show that the irreducible components of G are disjoint. 

(b) Show that the irreducible components of G are cosets of a 
Zariski closed subgroup of G. 

#10. The dimension of a Zariski closed set Z is the largest r, such
that there is a chain Z_0 ⊊ Z_1 ⊊ ⋯ ⊊ Z_r of nonempty, irreducible
Zariski closed subsets of Z.

It can be shown (and you may assume) that dim Z is the
largest r, for which there is a linear map T: Mat_{ℓ×ℓ}(R) → R^r,
such that T(Z) contains a nonempty open subset of R^r.

(a) Show dim Z = 0 if and only if Z is finite and nonempty.

(b) Show dim Z_1 ≤ dim Z_2 if Z_1 ⊂ Z_2.

(c) Show dim(Z_1 ∪ Z_2) = max{dim Z_1, dim Z_2} if Z_1 and Z_2 are
Zariski closed.

(d) Show dim SL(ℓ,R) = ℓ^2 − 1.

(e) Show that the collection of irreducible Zariski closed subsets
of SL(ℓ,R) has the ascending chain condition: if Z_1 ⊂ Z_2 ⊂
⋯ is an increasing chain of irreducible Zariski closed sets,
then we have Z_n = Z_{n+1} = ⋯ for some n.

#11. Suppose V and W are Zariski closed sets in SL(ℓ,R). Show that
if

• V ⊂ W,

• W is irreducible, and

• dim V = dim W,

then V = W.

4.2. Zariski closure 

(4.2.1) Definition. The Zariski closure of a subset H of SL(ℓ,R) is
the (unique) smallest Zariski closed subset of SL(ℓ,R) that contains H
(see Exer. 1). We use H̄ to denote the Zariski closure of H.

(4.2.2) Remark.

1) Obviously, H is Zariski closed if and only if H̄ = H.

2) One can show that if H is a subgroup of SL(ℓ,R), then H̄ is also
a subgroup of SL(ℓ,R) (see Exer. 4.3#11).

Every Zariski closed subgroup of SL(ℓ,R) is closed (see Exer. 4.1#1)
and has only finitely many connected components (see 4.1.3). The
converse is false:
(4.2.3) Example. Let

\[ A = \left\{ \begin{bmatrix} t & 0 \\ 0 & 1/t \end{bmatrix} \;\middle|\; t \in \mathbb{R}^+ \right\} \subset \mathrm{SL}(2,\mathbb{R}). \]

Then

1) A is closed,

2) A is connected (so it has only one connected component), and

3) \[ \overline{A} = \left\{ \begin{bmatrix} t & 0 \\ 0 & 1/t \end{bmatrix} \;\middle|\; t \in \mathbb{R} \smallsetminus \{0\} \right\} \] (see Exer. 3).

So Ā ≅ R \ {0} has two connected components. Since Ā ≠ A, we know
that A is not Zariski closed.

Although Ā is not exactly equal to A in Eg. 4.2.3, there is very
little difference: A has finite index in Ā. For most purposes, a finite
group can be ignored, so we make the following definition.

(4.2.4) Definition. A subgroup H of SL(ℓ,R) is almost Zariski
closed if H is a finite-index subgroup of H̄.

(4.2.5) Remark. Any finite-index subgroup of a Lie group is closed
(see Exer. 5), so any subgroup of SL(ℓ,R) that is almost Zariski closed
must be closed.

The reader may find it helpful to have some alternative
characterizations (see Exer. 6):

(4.2.6) Remark.

1) A connected subgroup H of SL(ℓ,R) is almost Zariski closed if
and only if it is the identity component of a subgroup that is
Zariski closed.

2) A subgroup H of SL(ℓ,R) is almost Zariski closed if and only if
it is the union of (finitely many) components of a Zariski closed
group.

3) Suppose H has only finitely many connected components. Then
H is almost Zariski closed if and only if its identity component
H° is almost Zariski closed.

4) Suppose H is a Lie subgroup of SL(ℓ,R). Then H is almost Zariski
closed if and only if dim H̄ = dim H.

Note that if H is almost Zariski closed, then it is closed, and has
only finitely many connected components. Here are two examples to
show that the converse is false. (Both examples are closed and
connected.) Corollary 4.6.8 below implies that all examples of this
phenomenon must be based on similar constructions.

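(4.2.7) Example.

1) For any irrational number α, let

\[ T = \left\{ \begin{bmatrix} t^\alpha & 0 & 0 \\ 0 & t & 0 \\ 0 & 0 & 1/t^{\alpha+1} \end{bmatrix} \;\middle|\; t \in \mathbb{R}^+ \right\} \subset \mathrm{SL}(3,\mathbb{R}). \]

Then

\[ \overline{T} = \left\{ \begin{bmatrix} s & 0 & 0 \\ 0 & t & 0 \\ 0 & 0 & 1/(st) \end{bmatrix} \;\middle|\; s, t \in \mathbb{R} \smallsetminus \{0\} \right\}. \]

Since dim T = 1 ≠ 2 = dim T̄, we conclude that T is not almost
Zariski closed.

The calculation of T̄ follows easily from Cor. 4.5.4 below.
Intuitively, the idea is simply that, for elements g of T, the relation
between g_{1,1} and g_{2,2} is transcendental, not algebraic, so it
cannot be captured by a polynomial. Thus, as far as polynomials
are concerned, there is no relation at all between g_{1,1} and g_{2,2};
they can vary independently of one another. This independence
is reflected in the Zariski closure.

2) Let

\[ H = \left\{ \begin{bmatrix} e^t & 0 & 0 & 0 \\ 0 & e^{-t} & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & t & 1 \end{bmatrix} \;\middle|\; t \in \mathbb{R} \right\} \subset \mathrm{SL}(4,\mathbb{R}). \]

Then

\[ \overline{H} = \left\{ \begin{bmatrix} e^s & 0 & 0 & 0 \\ 0 & e^{-s} & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & t & 1 \end{bmatrix} \;\middle|\; s, t \in \mathbb{R} \right\} \subset \mathrm{SL}(4,\mathbb{R}). \]

Since dim H = 1 ≠ 2 = dim H̄, we conclude that H is not almost
Zariski closed.

Formally, the fact that H is not almost Zariski closed follows
from Thm. 4.5.4 below. Intuitively, the transcendental relation
between g_{1,1} and g_{4,3} is lost in the Zariski closure.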

Exercises for §4.2. 

#1. For each subset H of SL(ℓ,R), show there is a unique Zariski
closed subset H̄ of SL(ℓ,R) containing H, such that if C is any
Zariski closed subset of SL(ℓ,R) that contains H, then H̄ ⊂ C.
[Hint: Any intersection of Zariski closed sets is Zariski closed.]
#2. Show that if Z is any subset of an algebraic group G, then the
centralizer C_G(Z) is Zariski closed.

#3. Verify 4.2.3(3).

[Hint: Let 𝒬 = { x_{1,2}, x_{2,1} }. If Q(x_{1,1}, x_{2,2}) is a polynomial, such that
Q(t, 1/t) = 0 for all t > 0, then Q(t, 1/t) = 0 for all nonzero t ∈ R.]

#4. Show that if H is a connected subgroup of SL(ℓ,R), then H̄ is
irreducible.

#5. (Requires some Lie theory) Suppose H is a finite-index subgroup
of a Lie group G. Show that H is an open subgroup of G. (So H
is closed.)

[Hint: There exists n ∈ Z^+, such that g^n ∈ H for all g ∈ G. Therefore
exp(x) = exp((1/n)x)^n ∈ H for every element x of the Lie algebra
of G.]

#6. Verify each part of Rem. 4.2.6.

4.3. Real Jordan decomposition 

The real Jordan decomposition writes any matrix as a combination 
of matrices of three basic types. 

(4.3.1) Definition. Let g ∈ SL(ℓ,R).

• g is unipotent if 1 is the only eigenvalue of g (over C); in other
words, (g − I)^ℓ = 0 (see 1.1.7).

• g is hyperbolic (or R-split) if it is diagonalizable over R, and
all of its eigenvalues are positive; that is, if h^{-1}gh is a diagonal
matrix with no negative entries, for some h ∈ SL(ℓ,R).

• g is elliptic if it is diagonalizable over C, and all of its eigenvalues
are of absolute value 1.

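(4.3.2) Example. For all t ∈ R:

1) \[ \begin{bmatrix} 1 & 0 \\ t & 1 \end{bmatrix} \] is unipotent,

2) \[ \begin{bmatrix} e^t & 0 \\ 0 & e^{-t} \end{bmatrix} \] is hyperbolic,

3) \[ \begin{bmatrix} \cos t & \sin t \\ -\sin t & \cos t \end{bmatrix} \] is elliptic (see Exer. 2).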



See Exer. 3 for an easy way to tell whether an element of SL(2,R) is 
unipotent, hyperbolic, or elliptic. 
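
The three matrices of Eg. 4.3.2 can be sorted by their traces, following parts (a), (b), and (c) of Exer. 3. The sketch below is only an illustration: the function name, the floating-point tolerance, and the value t = 0.7 are assumptions, and the input is assumed to have determinant 1. Note that the identity matrix has trace 2 and is reported as unipotent here, although it satisfies all three definitions.

```python
import math

def classify(g, tol=1e-12):
    """Classify a 2x2 real matrix of determinant 1 by its trace."""
    tr = g[0][0] + g[1][1]
    if abs(tr - 2.0) <= tol:
        return "unipotent"
    if tr > 2.0:
        return "hyperbolic"
    if -2.0 < tr < 2.0:
        return "elliptic"
    return "none of the three"

t = 0.7  # any nonzero value small enough that cos(t) is not 1 would do
unipotent = [[1.0, 0.0], [t, 1.0]]
hyperbolic = [[math.exp(t), 0.0], [0.0, math.exp(-t)]]
elliptic = [[math.cos(t), math.sin(t)], [-math.sin(t), math.cos(t)]]

labels = [classify(m) for m in (unipotent, hyperbolic, elliptic)]
```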

(4.3.3) Proposition (Real Jordan decomposition). For any g ∈ SL(ℓ,R),
there exist unique g_u, g_h, g_e ∈ SL(ℓ,R), such that

1) g = g_u g_h g_e,

2) g_u is unipotent,

3) g_h is hyperbolic,

4) g_e is elliptic, and

5) g_u, g_h, and g_e all commute with each other.

Proof. (Existence) The usual Jordan decomposition of Linear
Algebra (also known as "Jordan Canonical Form") implies there exist
h ∈ SL(ℓ,C), a nilpotent matrix N, and a diagonal matrix D, such
that h^{-1}gh = N + D, and N commutes with D. This is an additive
decomposition. By factoring out D, we obtain a multiplicative
decomposition:

\[ h^{-1} g h = (N D^{-1} + I) D = u D, \]

where u = ND^{-1} + I is unipotent (because u − I = ND^{-1} is nilpotent,
since N commutes with D^{-1}).

Now, because any nonzero complex number z has a (unique) polar form
z = re^{iθ}, we may write D = D_h D_e, where D_h is hyperbolic, D_e is
elliptic, and both matrices are diagonal, so they commute with each
other (and, from the structure of the Jordan Canonical Form, they
both commute with N). Conjugating by h^{-1}, we obtain

\[ g = h (u D_h D_e) h^{-1} = g_u g_h g_e, \]

where g_u = huh^{-1}, g_h = hD_hh^{-1}, and g_e = hD_eh^{-1}. This is the
desired decomposition.

(Uniqueness) The uniqueness of the decomposition is, perhaps, not
so interesting to the reader, so we relegate it to the exercises (see
Exers. 5 and 6). Uniqueness is, however, often of vital importance. For
example, it can be used to address a technical difficulty that was
ignored in the above proof: from our construction, it appears that the
matrices g_u, g_h, and g_e may have complex entries, not real. However,
using an overline to denote complex conjugation, we have

\[ \overline{g} = \overline{g_u} \; \overline{g_h} \; \overline{g_e}. \]

Since \overline{g} = g, the uniqueness of the decomposition implies
\overline{g_u} = g_u, \overline{g_h} = g_h, and \overline{g_e} = g_e.
Therefore, g_u, g_h, g_e ∈ SL(ℓ,R), as desired. □
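
The existence argument can be followed step by step on a matrix that is already in Jordan form, so that h = I. The sketch below makes several simplifying assumptions that are not part of the proof: the diagonal matrix D has real entries (so the elliptic part just records signs), the sample matrix is chosen by hand, and exact rational arithmetic is used. The function names are hypothetical.

```python
from fractions import Fraction as F

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def diagonal(entries):
    """Diagonal matrix with the given diagonal entries."""
    n = len(entries)
    return [[entries[i] if i == j else F(0) for j in range(n)] for i in range(n)]

def decompose(N, D):
    """Given g = N + D with N nilpotent, D diagonal with nonzero real entries,
    and ND = DN, return (g_u, g_h, g_e) as in the existence proof with h = I."""
    n = len(D)
    d = [D[i][i] for i in range(n)]
    ND_inv = matmul(N, diagonal([F(1) / x for x in d]))
    # g_u = N D^{-1} + I
    g_u = [[ND_inv[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]
    g_h = diagonal([abs(x) for x in d])                  # the moduli r
    g_e = diagonal([F(1 if x > 0 else -1) for x in d])   # the phases e^{i theta} = +1 or -1
    return g_u, g_h, g_e

# g = N + D has determinant (-2)(-2)(1/4) = 1.
D = diagonal([F(-2), F(-2), F(1, 4)])
N = [[F(0), F(0), F(0)], [F(1), F(0), F(0)], [F(0), F(0), F(0)]]
g = [[N[i][j] + D[i][j] for j in range(3)] for i in range(3)]

g_u, g_h, g_e = decompose(N, D)
product = matmul(matmul(g_u, g_h), g_e)
```

In this example g_u has a single off-diagonal entry −1/2, g_h = diag(2, 2, 1/4), and g_e = diag(−1, −1, 1).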

The uniqueness of the Jordan decomposition implies, for g, h ∈
SL(ℓ,R), that if g commutes with h, then the Jordan components g_u,
g_h, and g_e commute with h (see also Exer. 5). In other words, if the
centralizer C_{SL(ℓ,R)}(h) contains g, then it must also contain the
Jordan components of g. Because the centralizer is Zariski closed (see
Exer. 4.2#2), this is a special case of the following important result.

(4.3.4) Theorem. If

• G is a Zariski closed subgroup of SL(ℓ,R), and

• g ∈ G,

then g_u, g_h, g_e ∈ G.
We postpone the proof to §4.5.

As mentioned at the start of the chapter, we should assume that 
homomorphisms are polynomial functions. (But some other types of 
functions will be allowed to be more general rational functions, which 
are not defined when the denominator is 0.) 

(4.3.5) Definition. Let H be a subset of SL(ℓ,R).

1) A function φ: H → R is a polynomial (or is regular) if there
exists Q ∈ R[x_{1,1}, …, x_{ℓ,ℓ}], such that φ(h) = Q(h) for all h ∈ H.

2) A real-valued function ψ defined on a subset of H is rational if
there exist polynomials φ_1, φ_2: H → R, such that

(a) the domain of ψ is { h ∈ H | φ_2(h) ≠ 0 }, and

(b) ψ(h) = φ_1(h) / φ_2(h) for all h in the domain of ψ.

3) A function φ: H → SL(n,R) is a polynomial if, for each 1 ≤
i, j ≤ n, the matrix entry φ(h)_{i,j} is a polynomial function of
h ∈ H. Similarly, ψ is rational if each ψ(h)_{i,j} is a rational
function of h ∈ H.
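
Inversion on SL(2,R) gives a simple instance of part (3) of this definition. By Cramer's Rule, the inverse of a matrix is its adjugate divided by its determinant, and the determinant is 1 on SL(2,R), so each entry of g^{-1} is a polynomial (here, of degree 1) in the entries of g. The sketch below checks this on one sample matrix; the function names and the sample matrix are assumptions made for illustration.

```python
from fractions import Fraction as F

def inverse_sl2(g):
    """Inverse of a 2x2 matrix of determinant 1, via the adjugate (Cramer's Rule).

    Each entry of the result is a polynomial in the entries of g.
    """
    (a, b), (c, d) = g
    return [[d, -b], [-c, a]]

def matmul(x, y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

g = [[F(2), F(3)], [F(1), F(2)]]   # determinant 2*2 - 3*1 = 1
g_inv = inverse_sl2(g)
```

For matrices of determinant other than 1, the same formula would have to be divided by the determinant, which gives a rational function rather than a polynomial.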

We now show that any polynomial homomorphism respects the
real Jordan decomposition; that is, ρ(g_u) = ρ(g)_u, ρ(g_h) = ρ(g)_h, and
ρ(g_e) = ρ(g)_e.

(4.3.6) Corollary. Suppose

• G is a real algebraic group, and

• ρ: G → SL(m,R) is a polynomial homomorphism.
Then:

1) If u is any unipotent element of G, then ρ(u) is a unipotent
element of SL(m,R).

2) If a is any hyperbolic element of G, then ρ(a) is a hyperbolic
element of SL(m,R).

3) If k is any elliptic element of G, then ρ(k) is an elliptic element
of SL(m,R).

Proof. Note that the graph of ρ is a Zariski closed subgroup of
G × SL(m,R) (see Exer. 15).

We prove only (1); the others are similar. Since u is unipotent,
we have u_u = u, u_h = e, and u_e = e. Therefore, the real Jordan
decomposition of (u, ρ(u)) is

\[ (u, \rho(u)) = (u, \rho(u)_u) \, (e, \rho(u)_h) \, (e, \rho(u)_e). \]

Since (u, ρ(u)) ∈ graph ρ, Thm. 4.3.4 implies

\[ (u, \rho(u)_u) = (u, \rho(u))_u \in \operatorname{graph} \rho. \]

Let y = ρ(u)_u. Since (u, y) ∈ graph ρ, we have ρ(u) = y. Hence ρ(u) =
ρ(u)_u is unipotent. □



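Exercises for §4.3.

#1. Show that every element of U_ℓ (the group of lower-triangular
matrices with 1's on the diagonal; see Eg. 4.1.2(3)) is unipotent.

#2. Show

\[ \begin{bmatrix} \cos t & \sin t \\ -\sin t & \cos t \end{bmatrix} \]

is an elliptic element of SL(2,R), for every t ∈ R.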

#3. Let g ∈ SL(2,R). Recall that trace g is the sum of the diagonal
entries of g. Show:

(a) g is unipotent if and only if trace g = 2. 

(b) g is hyperbolic if and only if trace g > 2. 

(c) g is elliptic if and only if — 2 < trace g < 2. 

(d) g is neither unipotent, hyperbolic, nor elliptic if and only if 
trace g < —2. 

#4. Suppose g and h are elements of SL(ℓ,R), such that gh = hg.
Show: 

(a) If g and h are unipotent, then gh is unipotent. 

(b) If g and h are hyperbolic, then gh is hyperbolic. 

(c) If g and h are elliptic, then gh is elliptic. 

#5. Suppose g, g_u, g_h, g_e ∈ SL(ℓ,C), and these matrices are as
described in the conclusion of Prop. 4.3.3. Show (without
using the Jordan decomposition or any of its properties) that if
x ∈ SL(ℓ,C), and x commutes with g, then x also commutes
with each of g_u, g_h, and g_e.

[Hint: Passing to a conjugate, assume g_h and g_e are diagonal. We have
g_h^{-n} x g_h^n = (g_u g_e)^n x (g_u g_e)^{-n}. Since each matrix entry of the LHS is an
exponential function of n, but each matrix entry on the RHS grows at
most polynomially, we see that the LHS must be constant. So x
commutes with g_h. Then g_e^{-n} x g_e^n = g_u^n x g_u^{-n}. Since a bounded polynomial
must be constant, we see that x commutes with g_u and g_e.]

#6. Show that the real Jordan decomposition is unique. 

[Hint: If g = g u g h g £ = gWhg'e, then g^g'u = ghgeigWe)' 1 is both 
unipotent and diagonalizable over C (this requires Exer. 5). Therefore 
9u = g' u - Similarly, g h = g' h and g e = g' e ] 

#7. Suppose g e SL(£ , R),v£ R e , and v is an eigenvector for g. Show 
that v is also an eigenvector for g u , gh, and g e . 
[Hint: Let W be the eigenspace corresponding to the eigenvalue A 
associated to v. Because g u , gh, and g e commute with g, they pre- 
serve W . The Jordan decomposition of g\w, the restriction of g to W, 

is {g\w)u(g\w)h(g\w)e] 
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#8. Show that any commuting set of diagonalizable matrices can be 
diagonalized simultaneously. More precisely, suppose 

• S C SL(£,R), 

• each s G S is hyperbolic, and 

• the elements of S all commute with each other. 

Show there exists h € SL(£, R), such that every element of h~ 1 Sh 
is diagonal. 

#9. Suppose $G$ is a subgroup of $\mathrm{SL}(\ell, \mathbb{R})$ that is almost Zariski
closed.

(a) For $i(g) = g^{-1}$, show that $i$ is a polynomial function from $G$
to $G$.

(b) For $m(g, h) = gh$, show that $m$ is a polynomial function
from $G \times G$ to $G$. (Note that $G \times G$ can naturally be realized
as a subgroup of $\mathrm{SL}(2\ell, \mathbb{R})$ that is almost Zariski closed.)

[Hint: Cramer's Rule provides a polynomial formula for the inverse of
a matrix of determinant one. The usual formula for the product of two
matrices is a polynomial.]

#10. Show that if

• $f\colon \mathrm{SL}(\ell, \mathbb{R}) \to \mathrm{SL}(m, \mathbb{R})$ is a polynomial, and

• $H$ is a Zariski closed subgroup of $\mathrm{SL}(m, \mathbb{R})$,
then $f^{-1}(H)$ is Zariski closed.

#11. Show that if $H$ is any subgroup of $\mathrm{SL}(\ell, \mathbb{R})$, then the Zariski
closure $\overline{H}$ of $H$ is also a subgroup of $\mathrm{SL}(\ell, \mathbb{R})$.
[Hint: Exercises 9 and 10.]

#12. Show that if $H$ is a connected Lie subgroup of $\mathrm{SL}(\ell, \mathbb{R})$, then the
normalizer $N_{\mathrm{SL}(\ell, \mathbb{R})}(H)$ is Zariski closed.

[Hint: The homomorphism $\mathrm{Ad}\colon \mathrm{SL}(\ell, \mathbb{R}) \to \mathrm{SL}(\mathfrak{sl}(\ell, \mathbb{R}))$ is a
polynomial.]

#13. Show that if $G$ is any connected subgroup of $\mathrm{SL}(\ell, \mathbb{R})$, then $G$ is
a normal subgroup of its Zariski closure $\overline{G}$.

#14. There is a natural embedding of $\mathrm{SL}(\ell, \mathbb{R}) \times \mathrm{SL}(m, \mathbb{R})$ in
$\mathrm{SL}(\ell + m, \mathbb{R})$. Show that if $G$ and $H$ are Zariski closed subgroups of
$\mathrm{SL}(\ell, \mathbb{R})$ and $\mathrm{SL}(m, \mathbb{R})$, respectively, then $G \times H$ is Zariski closed
in $\mathrm{SL}(\ell + m, \mathbb{R})$.

#15. Suppose $G$ is a Zariski closed subgroup of $\mathrm{SL}(\ell, \mathbb{R})$, and
$\rho\colon G \to \mathrm{SL}(m, \mathbb{R})$ is a polynomial homomorphism. There is a natural
embedding of the graph of $\rho$ in $\mathrm{SL}(\ell + m, \mathbb{R})$ (cf. Exer. 14). Show
that the graph of $\rho$ is Zariski closed.

4.4. Structure of almost-Zariski closed groups

The main result of this section is that any algebraic group can be 
decomposed into subgroups of three basic types: unipotent, torus, and 
semisimple (see Thm. 4.4.7). 

(4.4.1) Definition. 

• A subgroup $U$ of $\mathrm{SL}(\ell, \mathbb{R})$ is unipotent if and only if it is
conjugate to a subgroup of $\mathbb{U}_\ell$.

• A subgroup $T$ of $\mathrm{SL}(\ell, \mathbb{R})$ is a torus if

o $T$ is conjugate (over $\mathbb{C}$) to a group of diagonal matrices; that
is, $h^{-1} T h$ consists entirely of diagonal matrices, for some
$h \in \mathrm{SL}(\ell, \mathbb{C})$,

o $T$ is connected, and

o $T$ is almost Zariski closed.

(We have required tori to be connected, but this requirement
should be relaxed slightly; any subgroup of the Zariski closure $\overline{T}$
that contains $T$ may also be called a torus.)

• A closed subgroup $L$ of $\mathrm{SL}(\ell, \mathbb{R})$ is semisimple if its identity
component $L^\circ$ has no nontrivial, connected, abelian, normal
subgroups.

(4.4.2) Remark. Here are alternative characterizations of unipotent 
groups and tori: 

1) (Engel's Theorem) A subgroup $U$ of $\mathrm{SL}(\ell, \mathbb{R})$ is unipotent if and
only if every element of $U$ is unipotent (see Exer. 5).

2) A connected subgroup $T$ of $\mathrm{SL}(\ell, \mathbb{R})$ is a torus if and only if

• $T$ is abelian,

• each individual element of $T$ is diagonalizable (over $\mathbb{C}$), and

• $T$ is almost Zariski closed
(see Exer. 4.3#8).

Unipotent groups and tori are fairly elementary, but the semisimple 
groups are more difficult to understand. The following fundamental 
theorem of Lie theory reduces their study to simple groups (which 
justifies their name). 

(4.4.3) Definition. A group G is almost simple if it has no infinite, 
proper, normal subgroups. 

(4.4.4) Theorem. Let $L$ be a connected, semisimple subgroup of $\mathrm{SL}(\ell, \mathbb{R})$.
Then, for some $n$, there are closed, connected subgroups $S_1, \ldots, S_n$ of $L$,
such that

1) each $S_i$ is almost simple, and

2) $L$ is isomorphic to (a finite cover of) $S_1 \times \cdots \times S_n$.

The almost-simple groups have been classified by using the the- 
ory known as "roots and weights." We merely provide some typical 
examples, without proof. 

(4.4.5) Example. 

1) $\mathrm{SL}(\ell, \mathbb{R})$ is almost simple (if $\ell \ge 2$).

2) If $Q$ is a quadratic form on $\mathbb{R}^\ell$ that is nondegenerate (see
Defn. 1.2.1), and $\ell \ge 3$, then $\mathrm{SO}(Q)$ is semisimple (and it is
almost simple if, in addition, $\ell \ne 4$). (For $\ell = 2$, the groups
$\mathrm{SO}(2)$ and $\mathrm{SO}(1, 1)$ are tori, not semisimple (see Exer. 1).)

From the above almost-simple groups, it is easy to construct nu- 
merous semisimple groups. One example is 

$$\mathrm{SL}(3, \mathbb{R}) \times \mathrm{SL}(7, \mathbb{R}) \times \mathrm{SO}(6) \times \mathrm{SO}(4, 7).$$

The following structure theorem is one of the major results in the 
theory of algebraic groups. 

(4.4.6) Definition. Recall that a Lie group $G$ is a semidirect product
of closed subgroups $A$ and $B$ (denoted $G = A \ltimes B$) if

1) $G = AB$,

2) $B$ is a normal subgroup of $G$, and

3) $A \cap B = \{e\}$.

(In this case, the map $(a, b) \mapsto ab$ is a diffeomorphism from $A \times B$
onto $G$. However, it is not a group isomorphism (or even a
homomorphism) unless every element of $A$ commutes with every element of $B$.)
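The parenthetical remark in Defn. 4.4.6 can be seen concretely in the group of lower-triangular matrices in $\mathrm{SL}(2, \mathbb{R})$. The following sketch assumes, purely for illustration, that $A$ is the diagonal subgroup and $B$ is the unipotent lower-triangular subgroup; the helper names `diag` and `unip` are our own. It checks that $A$ normalizes $B$, and that $(a, b) \mapsto ab$ does not respect multiplication.

```python
from fractions import Fraction as F


def matmul(p, q):
    """Multiply two 2 x 2 matrices given as lists of rows."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def diag(a):
    """Element diag(a, 1/a) of the diagonal subgroup A."""
    return [[a, F(0)], [F(0), 1 / a]]


def unip(b):
    """Element of the unipotent lower-triangular subgroup B."""
    return [[F(1), F(0)], [b, F(1)]]


a1, b1 = diag(F(2)), unip(F(1))
a2, b2 = diag(F(3)), unip(F(5))

# B is normalized by A: a1 * b1 * a1^{-1} is again in B.
conj = matmul(matmul(a1, b1), diag(F(1, 2)))

lhs = matmul(matmul(a1, b1), matmul(a2, b2))   # (a1 b1)(a2 b2)
rhs = matmul(matmul(a1, a2), matmul(b1, b2))   # (a1 a2)(b1 b2)
```

Here `lhs` and `rhs` differ in the lower-left entry, because `a2` does not commute with `b1`.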

(4.4.7) Theorem. Let $G$ be a connected subgroup of $\mathrm{SL}(\ell, \mathbb{R})$ that is
almost Zariski closed. Then there exist:

• a semisimple subgroup $L$ of $G$,

• a torus $T$ in $G$, and

• a unipotent subgroup $U$ of $G$,
such that

1) $G = (LT) \ltimes U$,

2) $L$, $T$, and $U$ are almost Zariski closed, and

3) $L$ and $T$ centralize each other, and have finite intersection.

Sketch of proof (requires some Lie theory). Let $R$ be the radical of $G$,
and let $L$ be a Levi subgroup of $G$; thus, $R$ is solvable, $L$ is semisimple,
$LR = G$, and $L \cap R$ is discrete (see 4.9.15). From the Lie-Kolchin
Theorem (4.9.17), we know that $R$ is conjugate (over $\mathbb{C}$) to a group of
lower-triangular matrices. By working in $\mathrm{SL}(\ell, \mathbb{C})$, let us assume, for
simplicity, that $R$ itself is lower triangular. That is, $R \subset \mathbb{D}_\ell \ltimes \mathbb{U}_\ell$.

Let $\pi\colon \mathbb{D}_\ell \ltimes \mathbb{U}_\ell \to \mathbb{D}_\ell$ be the natural projection. It is not difficult to
see that there exists $r \in R$, such that $\pi(R) \subset \overline{\langle \pi(r) \rangle}$, where the bar
denotes the Zariski closure (by using (4.4.12) and (4.5.4)). Let

$$T = \overline{\langle r_s r_e \rangle} \quad\text{and}\quad U = R \cap \mathbb{U}_\ell.$$

Because $\pi(r_s r_e) = \pi(r)$, we have $\pi(R) \subset \pi(T)$, so, for any $g \in R$, there
exists $t \in T$, such that $\pi(t) = \pi(g)$. Then $\pi(t^{-1} g) = e$, so $t^{-1} g \in U$.
Therefore $g \in tU \subset T \ltimes U$. Since $g \in R$ is arbitrary, we conclude that

$$R = T \ltimes U.$$

This yields the desired decomposition $G = (LT) \ltimes U$. □

(4.4.8) Remark. The subgroup $U$ of (4.4.7) is the unique maximal
unipotent normal subgroup of $G$. It is called the unipotent radical
of $G$.

It is obvious (from the Jordan decomposition) that every element 
of a compact real algebraic group is elliptic. We conclude this section by 
recording (without proof) the fact that this characterizes the compact 
real algebraic groups. 

(4.4.9) Theorem. An almost-Zariski closed subgroup of $\mathrm{SL}(\ell, \mathbb{R})$ is
compact if and only if all of its elements are elliptic.

(4.4.10) Corollary.

1) A nontrivial unipotent subgroup $U$ of $\mathrm{SL}(\ell, \mathbb{R})$ is never compact.

2) A torus $T$ in $\mathrm{SL}(\ell, \mathbb{R})$ is compact if and only if none of its
nontrivial elements are hyperbolic.

3) A connected, semisimple subgroup $L$ of $\mathrm{SL}(\ell, \mathbb{R})$ is compact if
and only if it has no nontrivial unipotent elements (also, if and
only if it has no nontrivial hyperbolic elements).

We conclude this section with two basic results about tori. 

(4.4.11) Definition. A torus T is hyperbolic (or R-split) if every 
element of T is hyperbolic. 

(4.4.12) Corollary. Any connected torus $T$ has a unique decomposition
into a direct product $T = T_h \times T_c$, where

1) $T_h$ is a hyperbolic torus, and

2) $T_c$ is a compact torus.

Proof. Let

$$T_h = \{\, g \in T \mid g \text{ is hyperbolic} \,\}$$

and

$$T_c = \{\, g \in T \mid g \text{ is elliptic} \,\}.$$

Because $T$ is abelian, it is easy to see that $T_h$ and $T_c$ are subgroups of $T$
(see Exer. 4.3#4). It is immediate from the real Jordan decomposition
that $T = T_h \times T_c$.

All that remains is to show that $T_h$ and $T_c$ are almost Zariski closed.

($T_h$) Since $T_h$ is a set of commuting matrices that are diagonalizable
over $\mathbb{R}$, there exists $h \in \mathrm{SL}(\ell, \mathbb{R})$, such that $h^{-1} T_h h \subset \mathbb{D}_\ell$ (see
Exer. 4.3#8). Hence, $T_h = T \cap (h \mathbb{D}_\ell h^{-1})$ is almost Zariski closed.

($T_c$) Let

$\mathbb{D}_\ell^{\mathbb{C}}$ be the group of diagonal matrices in $\mathrm{SL}(\ell, \mathbb{C})$,

and

$$C = \{\, g \in \mathbb{D}_\ell^{\mathbb{C}} \mid \text{every eigenvalue of } g \text{ has absolute value } 1 \,\}.$$

Because $T$ is a torus, there exists $h \in \mathrm{SL}(\ell, \mathbb{C})$, such that
$h^{-1} T h \subset \mathbb{D}_\ell^{\mathbb{C}}$. Then $T_c = T \cap h C h^{-1}$ is compact. So it is Zariski
closed (see Prop. 4.6.1 below). □

A (real) representation of a group is a homomorphism into 
SL(m, R), for some m. The following result provides an explicit de- 
scription of the representations of any hyperbolic torus. 

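(4.4.13) Corollary. Suppose

• $T$ is a (hyperbolic) torus that consists of diagonal matrices in
$\mathrm{SL}(\ell, \mathbb{R})$, and

• $\rho\colon T \to \mathrm{SL}(m, \mathbb{R})$ is any polynomial homomorphism.
Then there exists $h \in \mathrm{SL}(m, \mathbb{R})$, such that, letting

$$\rho'(t) = h^{-1} \rho(t)\, h \quad \text{for } t \in T,$$

we have:

1) $\rho'(T) \subset \mathbb{D}_m$; and

2) For each $j$ with $1 \le j \le m$, there are integers $n_1, \ldots, n_\ell$, such
that

$$\rho'(t)_{j,j} = t_{1,1}^{n_1}\, t_{2,2}^{n_2} \cdots t_{\ell,\ell}^{n_\ell}$$

for all $t \in T$.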

Proof. (1) Since $\rho(T)$ is a set of commuting matrices that are
diagonalizable over $\mathbb{R}$, there exists $h \in \mathrm{SL}(m, \mathbb{R})$, such that $h^{-1} \rho(T) h \subset \mathbb{D}_m$
(see Exer. 4.3#8).

(2) For each $j$, $\rho'(t)_{j,j}$ defines a polynomial homomorphism from $T$
to $\mathbb{R}^+$. With the help of Lie theory, it is not difficult to see that any
such homomorphism is of the given form (see Exer. 4.9#6). □

Exercises for §4.4.

#1. Show:

(a) $\mathrm{SO}(2)$ is a compact torus, and

(b) $\mathrm{SO}(1, 1)^\circ$ is a hyperbolic torus.
[Hint: We have

$$\mathrm{SO}(2) = \left\{ \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix} \right\} \quad\text{and}\quad \mathrm{SO}(1, 1) = \left\{ \pm \begin{bmatrix} \cosh t & \sinh t \\ \sinh t & \cosh t \end{bmatrix} \right\},$$

where $\cosh t = (e^t + e^{-t})/2$ and $\sinh t = (e^t - e^{-t})/2$.]
#2. Show: 

(a) The set of unipotent elements of $\mathrm{SL}(\ell, \mathbb{R})$ is Zariski closed.

(b) If $U$ is a unipotent subgroup of $\mathrm{SL}(\ell, \mathbb{R})$, then the Zariski
closure $\overline{U}$ of $U$ is also unipotent.

#3. Prove the easy direction ($\Rightarrow$) of Thm. 4.4.9.

#4. Assume that Thm. 4.4.9 has been proved for semisimple groups.
Prove the general case.
[Hint: Use Thm. 4.4.7.]

#5. (Advanced) Prove Engel's Theorem 4.4.2(1).

[Hint: ($\Leftarrow$) It suffices to show that $U$ fixes some nonzero vector $v$. (For
then we may consider the action of $U$ on $\mathbb{R}^\ell / \mathbb{R}v$, and complete the proof
by induction on $\ell$.) There is no harm in working over $\mathbb{C}$, rather than $\mathbb{R}$,
and we may assume there are no proper, nonzero $U$-invariant subspaces of $\mathbb{C}^\ell$. Then
a theorem of Burnside states that every $\ell \times \ell$ matrix $M$ is a linear
combination of elements of $U$. Hence, for any $u \in U$, $\operatorname{trace}(uM) =
\operatorname{trace} M$. Since $M$ is arbitrary, we conclude that $u = I$.]

4.5. Chevalley's Theorem and applications 

(4.5.1) Notation. For a map $\rho\colon G \to Z$ and $g \in G$, we often write $g^\rho$
for the image of $g$ under $\rho$. That is, $g^\rho$ is another notation for $\rho(g)$.

(4.5.2) Proposition (Chevalley's Theorem). A subgroup $H$ of a real
algebraic group $G$ is Zariski closed if and only if, for some $m$, there
exist

• a polynomial homomorphism $\rho\colon G \to \mathrm{SL}(m, \mathbb{R})$, and

• a vector $v \in \mathbb{R}^m$,

such that $H = \{\, h \in G \mid v h^\rho \in \mathbb{R}v \,\}$.

Proof. ($\Leftarrow$) This follows easily from Eg. 4.1.2(7) and Exer. 4.3#10.

($\Rightarrow$) There is no harm in assuming $G = \mathrm{SL}(\ell, \mathbb{R})$. There is a finite
subset $\mathcal{Q}$ of $\mathbb{R}[x_{1,1}, \ldots, x_{\ell,\ell}]$, such that $H = \operatorname{Var}(\mathcal{Q})$ (see Exer. 4.1#7c).
Choose $d \in \mathbb{Z}^+$, such that $\deg Q < d$ for all $Q \in \mathcal{Q}$, and let

• $V = \{\, Q \in \mathbb{R}[x_{1,1}, \ldots, x_{\ell,\ell}] \mid \deg Q < d \,\}$ and

• $W = \{\, Q \in V \mid Q(h_{i,j}) = 0 \text{ for all } h \in H \,\}$.
Thus, we have $H = \bigcap_{Q \in W} \operatorname{Var}(\{Q\})$.

There is a natural homomorphism $\rho$ from $\mathrm{SL}(\ell, \mathbb{R})$ to the group
$\mathrm{SL}(V)$ of (special) linear transformations on $V$, defined by

$$(Q g^\rho)(x_{i,j}) = Q\bigl((gx)_{i,j}\bigr) \qquad (4.5.3)$$

(see Exer. 2a). Note that we have $\operatorname{Stab}_{\mathrm{SL}(\ell, \mathbb{R})}(W) = H$ (see Exer. 2b).
By taking a basis for $V$, we may think of $\rho$ as a polynomial
homomorphism into $\mathrm{SL}(\dim V, \mathbb{R})$ (see Exer. 2c). Then this is almost exactly
what we want; the only problem is that, instead of a 1-dimensional
space $\mathbb{R}v$, we have the space $W$ of (possibly) larger dimension.

To complete the proof, we convert $W$ into a 1-dimensional space,
by using a standard trick of multilinear algebra. For $k = \dim W$, we let

$$V' = \textstyle\bigwedge^k V \quad\text{and}\quad W' = \textstyle\bigwedge^k W \subset V',$$

where $\bigwedge^k V$ denotes the $k$th exterior power of $V$. Now $\rho$ naturally
induces a polynomial homomorphism $\rho'\colon \mathrm{SL}(\ell, \mathbb{R}) \to \mathrm{SL}(V')$, and, for
this action, $H = \operatorname{Stab}_{\mathrm{SL}(\ell, \mathbb{R})}(W')$ (see Exer. 3). By choosing a basis
for $V'$, we can think of $\rho'$ as a homomorphism into $\mathrm{SL}\bigl(\binom{\dim V}{k}, \mathbb{R}\bigr)$.
Since $\dim W' = \binom{\dim W}{k} = 1$, we obtain the desired conclusion (with $\rho'$
in the place of $\rho$) by letting $v$ be any nonzero vector in $W'$. □
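The rule (4.5.3) defines a right action, so that $(Q g^\rho) h^\rho = Q (gh)^\rho$. The short sketch below spot-checks this identity at one point. The polynomial `Q`, the matrices `g`, `h`, `x`, and the helper name `act` are arbitrary illustrative choices of our own, and polynomials are modeled simply as Python functions of a matrix rather than as coefficient vectors in $V$.

```python
def matmul(g, h):
    """Multiply two square matrices given as lists of rows."""
    n = len(g)
    return [[sum(g[i][k] * h[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def act(Q, g):
    """Return the function Q g^rho of (4.5.3), namely x -> Q(g x)."""
    return lambda x: Q(matmul(g, x))


def Q(x):
    """A sample polynomial of degree 2 in the matrix entries (arbitrary)."""
    return x[0][0] * x[1][1] + 3 * x[0][1] - x[1][0] ** 2


g = [[1, 2], [0, 1]]
h = [[1, 0], [-1, 1]]
x = [[2, 1], [3, 2]]

left = act(act(Q, g), h)(x)        # ((Q g^rho) h^rho)(x) = Q(g (h x))
right = act(Q, matmul(g, h))(x)    # (Q (gh)^rho)(x)      = Q((g h) x)
```

The two values agree because matrix multiplication is associative; this is the content of Exer. 2a, checked here only at a single point.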

Proof of Thm. 4.3.4. From Chevalley's Theorem (4.5.2), we know
there exist

• a polynomial homomorphism $\rho\colon \mathrm{SL}(\ell, \mathbb{R}) \to \mathrm{SL}(m, \mathbb{R})$, for some $m$,
and

• a vector $v \in \mathbb{R}^m$,

such that $G = \{\, g \in \mathrm{SL}(\ell, \mathbb{R}) \mid v g^\rho \in \mathbb{R}v \,\}$. Furthermore, from the
explicit description of $\rho$ in the proof of Prop. 4.5.2, we see that it
satisfies the conclusions of Cor. 4.3.6 with $\mathrm{SL}(\ell, \mathbb{R})$ in the place of $G$
(cf. Exer. 4). Thus, for any $g \in \mathrm{SL}(\ell, \mathbb{R})$, we have

$$(g_u)^\rho = (g^\rho)_u, \quad (g_h)^\rho = (g^\rho)_h, \quad\text{and}\quad (g_e)^\rho = (g^\rho)_e.$$

For any $g \in G$, we have $v g^\rho \in \mathbb{R}v$. In other words, $v$ is an
eigenvector for $g^\rho$. Then $v$ is also an eigenvector for $(g^\rho)_u$ (see Exer. 4.3#7).
Since $(g_u)^\rho = (g^\rho)_u$, this implies $v (g_u)^\rho \in \mathbb{R}v$, so $g_u \in G$. By the same
argument, $g_h \in G$ and $g_e \in G$. □

Chevalley's Theorem yields an explicit description of the hyperbolic
tori.

(4.5.4) Corollary. Suppose $T$ is a connected group of diagonal matrices
in $\mathrm{SL}(\ell, \mathbb{R})$, and let $d = \dim T$. Then $T$ is almost Zariski closed if and
only if there are linear functionals $\lambda_1, \ldots, \lambda_\ell \colon \mathbb{R}^d \to \mathbb{R}$, such that

1) $$T = \left\{ \begin{bmatrix} e^{\lambda_1(x)} & & & \\ & e^{\lambda_2(x)} & & \\ & & \ddots & \\ & & & e^{\lambda_\ell(x)} \end{bmatrix} \;\middle|\; x \in \mathbb{R}^d \right\},$$ and

2) for each $i$, there are integers $n_1, \ldots, n_d$, such that

$$\lambda_i(x_1, \ldots, x_d) = n_1 x_1 + \cdots + n_d x_d \quad \text{for all } x \in \mathbb{R}^d.$$

Proof. Combine Prop. 4.5.2 with Cor. 4.4.13 (see Exer. 5). □

Exercises for §4.5. 

#1. Suppose $Q \in \mathbb{R}[x_{1,1}, \ldots, x_{\ell,\ell}]$ and $g \in \mathrm{SL}(\ell, \mathbb{R})$.

• Let $\phi\colon \mathrm{SL}(\ell, \mathbb{R}) \to \mathbb{R}$ be the polynomial function
corresponding to $Q$, and

• define $\phi'\colon \mathrm{SL}(\ell, \mathbb{R}) \to \mathbb{R}$ by $\phi'(x) = \phi(gx)$.

Show there exists $Q' \in \mathbb{R}[x_{1,1}, \ldots, x_{\ell,\ell}]$, with $\deg Q' = \deg Q$,
such that $\phi'$ is the polynomial function corresponding to $Q'$.
[Hint: For fixed $g$, the matrix entries of $gh$ are linear functions of $h$.]

#2. Define $\rho\colon \mathrm{SL}(\ell, \mathbb{R}) \to \mathrm{SL}(V)$ as in Eq. (4.5.3).

(a) Show $\rho$ is a group homomorphism.

(b) For the subspace $W$ defined in the proof of Prop. 4.5.2, show
$H = \operatorname{Stab}_{\mathrm{SL}(\ell, \mathbb{R})}(W)$.

(c) By taking a basis for $V$, we may think of $\rho$ as a map into
$\mathrm{SL}(\dim V, \mathbb{R})$. Show $\rho$ is a polynomial.

[Hint: (2b) We have $\mathcal{Q} \subset W$.]

#3. Suppose

• $W$ is a subspace of a real vector space $V$,

• $g$ is an invertible linear transformation on $V$, and

• $k = \dim W$.

Show $\bigwedge^k (Wg) = \bigwedge^k W$ if and only if $Wg = W$.

#4. Define $\rho\colon \mathrm{SL}(\ell, \mathbb{R}) \to \mathrm{SL}(V)$ as in Eq. (4.5.3).

(a) Show that if $g$ is hyperbolic, then $\rho(g)$ is hyperbolic.

(b) Show that if $g$ is elliptic, then $\rho(g)$ is elliptic.

(c) Show that if $g$ is unipotent, then $\rho(g)$ is unipotent.

[Hint: (4a,4b) If $g$ is diagonal, then any monomial is an eigenvector of
$g^\rho$.]

#5. Prove Cor. 4.5.4.

4.6. Subgroups that are almost Zariski closed

We begin the section with some results that guarantee certain types 
of groups are almost Zariski closed. 

(4.6.1) Proposition. Any compact subgroup of $\mathrm{SL}(\ell, \mathbb{R})$ is Zariski
closed.

Proof. Suppose $C$ is a compact subgroup of $\mathrm{SL}(\ell, \mathbb{R})$, and $g$ is an
element of $\mathrm{SL}(\ell, \mathbb{R}) \setminus C$. It suffices to find a polynomial $\phi$ on $\mathrm{SL}(\ell, \mathbb{R})$,
such that $\phi(C) = 0$, but $\phi(g) \ne 0$.

The sets $C$ and $Cg$ are compact and disjoint, so, for any $\epsilon > 0$,
the Stone-Weierstrass Theorem implies there is a polynomial $\phi_0$, such
that $\phi_0(c) < \epsilon$ and $\phi_0(cg) > 1 - \epsilon$ for all $c \in C$. (For our purposes,
we may choose any $\epsilon < 1/2$.) For each $c \in C$, let $\phi_c(x) = \phi_0(cx)$, so
$\phi_c$ is a polynomial of the same degree as $\phi_0$ (see Exer. 4.5#1). Define
$\overline{\phi}\colon \mathrm{SL}(\ell, \mathbb{R}) \to \mathbb{R}$ by averaging over $c \in C$:

$$\overline{\phi}(x) = \int_C \phi_c(x)\, dc,$$

where $dc$ is the Haar measure on $C$, normalized to be a probability
measure. Then

1) $\overline{\phi}(c) < \epsilon$ for $c \in C$,

2) $\overline{\phi}(g) > 1 - \epsilon$,

3) $\overline{\phi}$ is constant on $C$ (because Haar measure is invariant), and

4) $\overline{\phi}$ is a polynomial function (each of its coefficients is the average
of the corresponding coefficients of the $\phi_c$'s).

Now let $\phi(x) = \overline{\phi}(x) - \overline{\phi}(c)$ for any $c \in C$. □
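The averaging step can be tried out on a finite group, where Haar measure is just the uniform counting measure. The sketch below assumes, for illustration only, that $C$ is the cyclic group of order 4 generated by a quarter-turn rotation in $\mathrm{SL}(2, \mathbb{R})$; the polynomial `phi0` is an arbitrary choice of our own and is not produced by the Stone-Weierstrass Theorem. The point is only to see properties (3) and (4) at work: the average is constant on $C$, and subtracting that constant gives a polynomial that vanishes on $C$ but not at a chosen $g \notin C$.

```python
from fractions import Fraction as F


def matmul(p, q):
    """Multiply two 2 x 2 matrices given as lists of rows."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


I2 = [[F(1), F(0)], [F(0), F(1)]]
J = [[F(0), F(1)], [F(-1), F(0)]]                      # rotation by a quarter turn
C = [I2, J, matmul(J, J), matmul(J, matmul(J, J))]     # cyclic group of order 4


def phi0(x):
    """An arbitrary polynomial in the matrix entries (illustrative choice)."""
    return x[0][0] ** 3 + 2 * x[0][1] ** 2


def phi_bar(x):
    """Average of phi0(c x) over c in C (uniform measure on the finite group)."""
    return sum(phi0(matmul(c, x)) for c in C) / len(C)


def phi(x):
    """phi_bar minus its constant value on C, so phi vanishes on C."""
    return phi_bar(x) - phi_bar(I2)


g = [[F(2), F(0)], [F(0), F(1, 2)]]   # an element of SL(2, R) outside C
```

With these choices `phi` is zero at each of the four elements of `C`, while `phi(g)` is nonzero.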

(4.6.2) Proposition. If $U$ is a connected, unipotent subgroup of $\mathrm{SL}(\ell, \mathbb{R})$,
then $U$ is Zariski closed.

Proof (requires some Lie theory). By passing to a conjugate, we may
assume $U \subset \mathbb{U}_\ell$. The Lie algebra $\mathfrak{u}_\ell$ of $\mathbb{U}_\ell$ is the space of strictly
lower-triangular matrices (see Exer. 1). Because $A^\ell = 0$ for $A \in \mathfrak{u}_\ell$, the
exponential map

$$\exp(A) = I + A + \tfrac{1}{2} A^2 + \cdots + \tfrac{1}{(\ell - 1)!} A^{\ell - 1}$$

is a polynomial function on $\mathfrak{u}_\ell$, and its inverse, the logarithm map

$$\log(I + N) = N - \tfrac{1}{2} N^2 + \tfrac{1}{3} N^3 \pm \cdots \pm \tfrac{1}{\ell - 1} N^{\ell - 1},$$

is a polynomial function on $\mathbb{U}_\ell$.

Therefore $\exp$ is a bijection from $\mathfrak{u}_\ell$ onto $\mathbb{U}_\ell$, so $U = \exp \mathfrak{u}$, where
$\mathfrak{u}$ is the Lie algebra of $U$. This means

$$U = \{\, u \in \mathbb{U}_\ell \mid \log u \in \mathfrak{u} \,\}.$$

Since $\log$ is a polynomial function (and $\mathfrak{u}$, being a linear subspace,
is defined by polynomial equations, in fact by linear equations), this
implies that $U$ is defined by polynomial equations. Therefore, $U$ is
Zariski closed. □
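The fact that $\exp$ and $\log$ are finite sums on $\mathfrak{u}_\ell$ and $\mathbb{U}_\ell$, and are inverse to each other, can be verified exactly for a sample matrix. The following sketch works with $\ell = 3$ and rational entries; the function names `exp_nilpotent` and `log_unipotent` and the matrix `A` are hypothetical illustrative choices.

```python
from fractions import Fraction as F


def matmul(p, q):
    """Multiply two square matrices given as lists of rows."""
    n = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def identity(n):
    return [[F(int(i == j)) for j in range(n)] for i in range(n)]


def add(p, q, scale=F(1)):
    """Return p + scale * q."""
    return [[a + scale * b for a, b in zip(r, s)] for r, s in zip(p, q)]


def exp_nilpotent(A):
    """I + A + A^2/2! + ... + A^(n-1)/(n-1)!, which is exact when A^n = 0."""
    n = len(A)
    result, power, factorial = identity(n), identity(n), 1
    for k in range(1, n):
        power = matmul(power, A)
        factorial *= k
        result = add(result, power, F(1, factorial))
    return result


def log_unipotent(U):
    """N - N^2/2 + N^3/3 - ... with N = U - I, which is exact when N^n = 0."""
    n = len(U)
    N = add(U, identity(n), F(-1))
    result = [[F(0)] * n for _ in range(n)]
    power = identity(n)
    for k in range(1, n):
        power = matmul(power, N)
        result = add(result, power, F((-1) ** (k + 1), k))
    return result


# A strictly lower-triangular 3 x 3 matrix, so A^3 = 0.
A = [[F(0), F(0), F(0)], [F(1), F(0), F(0)], [F(2), F(3), F(0)]]
U = exp_nilpotent(A)
```

Applying `log_unipotent` to `U` returns `A` exactly, with no truncation error, because both series terminate after the term of degree $\ell - 1$.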

The following result is somewhat more difficult; we omit the proof. 

(4.6.3) Theorem. If $L$ is any connected, semisimple subgroup of
$\mathrm{SL}(\ell, \mathbb{R})$, then $L$ is almost Zariski closed.

The following three results show that being almost Zariski closed
is preserved by certain natural operations. We state the first without
proof.

(4.6.4) Proposition. If $A$ and $B$ are almost-Zariski closed subgroups
of $\mathrm{SL}(\ell, \mathbb{R})$, such that $AB$ is a subgroup, then $AB$ is almost Zariski
closed.

(4.6.5) Corollary. If $G$ and $H$ are almost Zariski closed, and $\rho$ is a
polynomial homomorphism from $G$ to $H$, then the image $\rho(G)$ is an
almost-Zariski closed subgroup of $H$.

Proof. By passing to a finite-index subgroup, we may assume $G$ is
connected. Write $G = (TL) \ltimes U$, as in Thm. 4.4.7. From Prop. 4.6.4, it
suffices to show that $\rho(U)$, $\rho(L)$, and $\rho(T)$ are almost Zariski closed. The
subgroups $\rho(U)$ and $\rho(L)$ are handled by Prop. 4.6.2 and Thm. 4.6.3.

Write $T = T_h \times T_c$, where $T_h$ is hyperbolic and $T_c$ is
compact (see Cor. 4.4.12). Then $\rho(T_c)$, being compact, is Zariski closed
(see Prop. 4.6.1). The subgroup $\rho(T_h)$ is handled easily by combining
Cors. 4.5.4 and 4.4.13 (see Exer. 2). □

(4.6.6) Corollary. If $G$ is any connected subgroup of $\mathrm{SL}(\ell, \mathbb{R})$, then
the commutator subgroup $[G, G]$ is almost Zariski closed.

Proof. Write $G = (LT) \ltimes U$, as in Thm. 4.4.7. Because $T$ is abelian and
$[L, L] = L$, we see that $[G, G]$ is a (connected) subgroup of $L \ltimes U$ that
contains $L$. Hence $[G, G] = L \ltimes U'$, where $U' = [G, G] \cap U$ (see Exer. 3).
Furthermore, since $[G, G]$ is connected, we know $U'$ is connected, so
$U'$ is Zariski closed (see Prop. 4.6.2). Since $L$ is almost Zariski closed
(see Thm. 4.6.3), this implies $[G, G] = L U'$ is almost Zariski closed (see
Prop. 4.6.4), as desired. □

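(4.6.7) Corollary. If $G$ is any connected subgroup of $\mathrm{SL}(\ell, \mathbb{R})$, and $\overline{G}$
is its Zariski closure, then $[\overline{G}, \overline{G}] = [G, G]$, so $\overline{G}/G$ is abelian.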






Proof. Define $c\colon \overline{G} \times \overline{G} \to \overline{G}$ by $c(g, h) = g^{-1} h^{-1} g h = [g, h]$, where $\overline{G}$
is the Zariski closure of $G$. Then $c$ is
a polynomial (see Exer. 4.3#9). Since $c(G \times G) \subset [G, G]$ and $[G, G]$ is
almost Zariski closed, we conclude immediately that $[\overline{G}, \overline{G}]^\circ \subset [G, G]$
(cf. Exer. 4.3#10). This is almost what we want, but some additional
theory (which we omit) is required in order to show that $[\overline{G}, \overline{G}]$ is
connected, rather than having finitely many components.

Because $[\overline{G}, \overline{G}] \subset G$, it is immediate that $\overline{G}/G$ is abelian. □

For connected groups, we now show that tori present the only ob- 
struction to being almost Zariski closed. 

(4.6.8) Corollary. If $G$ is any connected subgroup of $\mathrm{SL}(\ell, \mathbb{R})$, then
there is a connected, almost-Zariski closed torus $T$ of the Zariski closure $\overline{G}$, such that $GT$
is almost Zariski closed.

Proof. Write $\overline{G}^\circ = (TL) \ltimes U$, with $T$, $L$, $U$ as in Thm. 4.4.7. Because
$L = [L, L] \subset [\overline{G}, \overline{G}]$, we know $L \subset G$ (see Cor. 4.6.7). Furthermore,
because $T$ normalizes $G$ (see Exer. 4.3#12), we may assume $T \subset G$,
by replacing $G$ with $GT$.

Therefore $G = (TL) \ltimes (U \cap G)$ (see Exer. 3). Furthermore, since $G$
is connected, we know that $U \cap G$ is connected, so $U \cap G$ is Zariski closed
(see Prop. 4.6.2). Then Prop. 4.6.4 implies that $G = (TL) \ltimes (U \cap G)$ is
almost Zariski closed. □

We will make use of the following technical result: 

(4.6.9) Lemma. If

• $G$ is an almost-Zariski closed subgroup of $\mathrm{SL}(\ell, \mathbb{R})$,

• $H$ and $V$ are connected subgroups of $G$ that are almost Zariski
closed, and

• $f\colon V \to G$ is a rational function (not necessarily a
homomorphism), with $f(e) = e$,

then the subgroup $\langle H, f(V) \rangle$ is almost Zariski closed.

Plausibility argument. There is no harm in assuming that $G$ is the
identity component of the Zariski closure of $\langle f(V), H \rangle$, so we wish to
show that $H$ and $f(V)$, taken together,
generate $G$. Since $[G, G]H$ is

• almost Zariski closed (see Prop. 4.6.4),

• contained in $\langle H, f(V) \rangle$ (see Cor. 4.6.7), and

• normal in $G$ (because it contains $[G, G]$),

there is no harm in modding it out. Thus, we may assume that $G$ is
abelian and that $H = \{e\}$.

Now, using the fact that $G$ is abelian, we have $G = A \times C \times U$,
where $A$ is a hyperbolic torus, $C$ is a compact torus, and $U$ is unipotent
(see Thm. 4.4.7 and Cor. 4.4.12). Because these are three completely
different types of groups, it is not difficult to believe that there are
subgroups $A_V$, $C_V$, and $U_V$ of $A$, $C$, and $U$, respectively, such that
$\langle f(V) \rangle = A_V \times C_V \times U_V$ (cf. Exer. 4).

Now $U_V$, being connected and unipotent, is Zariski closed (see
Prop. 4.6.2). The other two require some argument. □



Exercises for §4.6. 

#1. Show that every unipotent real algebraic group is connected and 
simply connected. 
[Hint: See proof of (4.6.2).] 

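#2. Complete the proof of Cor. 4.6.5, by showing that if $T$ is a
hyperbolic torus, and $\rho\colon T \to \mathrm{SL}(m, \mathbb{R})$ is a polynomial
homomorphism, then $\rho(T)$ is almost Zariski closed.
[Hint: Use Cors. 4.5.4 and 4.4.13.]

#3. Show that if $G$ is a subgroup of a semidirect product $A \ltimes B$, and
$A \subset G$, then $G = A \ltimes (G \cap B)$. If, in addition, $G$ is connected,
show that $G \cap B$ is connected.

#4. Suppose $Q\colon \mathbb{R} \to \mathbb{R}$ is any nonconstant polynomial with $Q(0) = 0$,
and define $f\colon \mathbb{R} \to \mathbb{D}_2 \times \mathbb{U}_2 \subset \mathrm{SL}(4, \mathbb{R})$ by

$$f(t) = \begin{bmatrix} 1 + t^2 & 0 & 0 & 0 \\ 0 & 1/(1 + t^2) & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & Q(t) & 1 \end{bmatrix}.$$

Show $\langle f(\mathbb{R}) \rangle = \mathbb{D}_2 \times \mathbb{U}_2$.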



4.7. Borel Density Theorem 

The Borel Density Theorem (4.7.1) is a generalization of the
important fact that if $\Gamma = \mathrm{SL}(\ell, \mathbb{Z})$, then $\overline{\Gamma} = \mathrm{SL}(\ell, \mathbb{R})$ (see Exer. 1). Because
the Zariski closure of $\Gamma$ is all of $\mathrm{SL}(\ell, \mathbb{R})$, we may say that $\Gamma$ is Zariski
dense in $\mathrm{SL}(\ell, \mathbb{R})$. That is why this is known as a "density" theorem.

(4.7.1) Proposition (Borel Density Theorem). If $\Gamma$ is any lattice in
any closed subgroup $G$ of $\mathrm{SL}(\ell, \mathbb{R})$, then the Zariski closure $\overline{\Gamma}$ of $\Gamma$
contains

1) every unipotent element of $G$ and

2) every hyperbolic element of $G$.

We precede the proof with a remark and two lemmas. 

(4.7.2) Remark.

1) If $G$ is a compact group, then the trivial subgroup $\Gamma = \{e\}$ is a
lattice in $G$, and $\overline{\Gamma} = \{e\}$ does not contain any nontrivial elements
of $G$. This is consistent with Prop. 4.7.1, because nontrivial
elements of a compact group are neither unipotent nor hyperbolic
(see Cor. 4.4.10).

2) Although we do not prove this, $\overline{\Gamma}$ actually contains every
unipotent or hyperbolic element of the Zariski closure $\overline{G}$, not only those of $G$.

(4.7.3) Lemma (Poincaré Recurrence Theorem). Let

• $(\Omega, d)$ be a metric space;

• $T\colon \Omega \to \Omega$ be a homeomorphism; and

• $\mu$ be a $T$-invariant probability measure on $\Omega$.

Then, for almost every $a \in \Omega$, there is a sequence $n_k \to \infty$, such that
$T^{n_k} a \to a$.

Proof. Let

$$A_\epsilon = \{\, a \in \Omega \mid \forall m > 0,\ d(T^m a, a) > \epsilon \,\}.$$

It suffices to show $\mu(A_\epsilon) = 0$ for every $\epsilon$.

Suppose $\mu(A_\epsilon) > 0$. Then we may choose a subset $B$ of $A_\epsilon$, such
that $\mu(B) > 0$ and $\operatorname{diam}(B) < \epsilon$. The sets $B, T^{-1}B, T^{-2}B, \ldots$ cannot
all be disjoint, because they all have the same measure and $\mu(\Omega) < \infty$.
Hence, $T^{-m}B \cap T^{-n}B \ne \emptyset$, for some $m, n \in \mathbb{Z}^+$ with $m > n$. By
applying $T^n$, we may assume $n = 0$. For $a \in T^{-m}B \cap B$, we have
$T^m a \in B$ and $a \in B$, so

$$d(T^m a, a) \le \operatorname{diam}(B) < \epsilon.$$

Since $a \in B \subset A_\epsilon$, this contradicts the definition of $A_\epsilon$. □
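Recurrence is easy to observe numerically for a rotation of the circle, which preserves Lebesgue measure. The sketch below is an illustration under our own assumptions: the rotation number `ALPHA`, the starting point `0.3`, and the tolerance values are arbitrary, and floating-point arithmetic is used, so the computed return times are approximate evidence rather than a proof.

```python
import math

ALPHA = (math.sqrt(5) - 1) / 2   # an irrational rotation number (arbitrary choice)


def rotate(a):
    """T(a) = a + ALPHA mod 1, a measure-preserving homeomorphism of the circle."""
    return (a + ALPHA) % 1.0


def circle_distance(a, b):
    """Distance between two points of the circle R/Z."""
    d = abs(a - b) % 1.0
    return min(d, 1.0 - d)


def first_return(a, eps, max_steps=10_000):
    """Smallest m > 0 with d(T^m a, a) < eps, or None if none is found."""
    point = a
    for m in range(1, max_steps + 1):
        point = rotate(point)
        if circle_distance(point, a) < eps:
            return m
    return None


# Return times for successively smaller tolerances.
return_times = [first_return(0.3, eps) for eps in (0.1, 0.01, 0.001)]
```

Each tolerance yields a finite return time, and the times cannot decrease as the tolerance shrinks, which matches the existence of a sequence $n_k \to \infty$ with $T^{n_k} a \to a$.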

(4.7.4) Notation.

• Recall that the projective space $\mathbb{R}P^{m-1}$ is, by definition, the
set of one-dimensional subspaces of $\mathbb{R}^m$. Alternatively, $\mathbb{R}P^{m-1}$
can be viewed as the set of equivalence classes of the equivalence
relation on $\mathbb{R}^m \setminus \{0\}$ defined by

$$v \sim w \iff v = \alpha w \text{ for some } \alpha \in \mathbb{R} \setminus \{0\}.$$

From the alternate description, it is easy to see that $\mathbb{R}P^{m-1}$ is
an $(m - 1)$-dimensional smooth manifold (see Exer. 3).

• There is a natural action of $\mathrm{SL}(m, \mathbb{R})$ on $\mathbb{R}P^{m-1}$, defined by
$[v]g = [vg]$, where, for each nonzero $v \in \mathbb{R}^m$, we let $[v] = \mathbb{R}v$
be the image of $v$ in $\mathbb{R}P^{m-1}$.

(4.7.5) Lemma. Assume

• $g$ is an element of $\mathrm{SL}(m, \mathbb{R})$ that is either unipotent or hyperbolic,

• $\mu$ is a probability measure on the projective space $\mathbb{R}P^{m-1}$, and

• $\mu$ is invariant under $g$.

Then $\mu$ is supported on the set of fixed points of $g$.

Proof. Let $v$ be any nonzero vector in $\mathbb{R}^m$. For definiteness, let us
assume $g$ is unipotent. (See Exer. 4 for a replacement of this paragraph
in the case where $g$ is hyperbolic.) Letting $T = g - I$, we know that
$T$ is nilpotent (because $g$ is unipotent), so there is some integer $r \ge 0$,
such that $vT^r \ne 0$, but $vT^{r+1} = 0$. We have

$$vT^r g = (vT^r)(I + T) = vT^r + vT^{r+1} = vT^r + 0 = vT^r,$$

so $[vT^r] \in \mathbb{R}P^{m-1}$ is a fixed point for $g$. Also, for $n \in \mathbb{N}$, we have

$$[v]g^n = \Bigl[\, \sum_{k=0}^{r} \binom{n}{k} vT^k \Bigr] = \Bigl[\, \binom{n}{r}^{-1} \sum_{k=0}^{r} \binom{n}{k} vT^k \Bigr] \to [vT^r]$$

(because, for $k < r$, we have $\binom{n}{k} / \binom{n}{r} \to 0$ as $n \to \infty$). Thus, $[v]g^n$
converges to a fixed point of $g$, as $n \to \infty$.

The Poincaré Recurrence Theorem (4.7.3) implies, for $\mu$-almost
every $[v] \in \mathbb{R}P^{m-1}$, that there is a sequence $n_k \to \infty$, such that
$[v]g^{n_k} \to [v]$. On the other hand, we know, from the preceding
paragraph, that $[v]g^{n_k}$ converges to a fixed point of $g$. Thus, $\mu$-almost every
element of $\mathbb{R}P^{m-1}$ is a fixed point of $g$. In other words, $\mu$ is supported
on the set of fixed points of $g$, as desired. □
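The convergence of $[v]g^n$ to the fixed point $[vT^r]$ in the unipotent case can be watched numerically. The sketch below uses a $3 \times 3$ unipotent matrix and a starting vector chosen arbitrarily by us (so $r = 2$ here); the helper names `vecmat` and `projective_point` are hypothetical, and points of projective space are represented by unit vectors with a fixed sign convention.

```python
import math


def vecmat(v, g):
    """Row vector times matrix."""
    return [sum(v[i] * g[i][j] for i in range(len(v))) for j in range(len(g[0]))]


def projective_point(v):
    """Unit-length representative of [v], with first nonzero coordinate positive."""
    norm = math.sqrt(sum(c * c for c in v))
    sign = next(1 if c > 0 else -1 for c in v if c != 0)
    return [sign * c / norm for c in v]


g = [[1, 0, 0], [1, 1, 0], [0, 1, 1]]   # g = I + T, with T nilpotent
T = [[0, 0, 0], [1, 0, 0], [0, 1, 0]]
v = [1, 2, 3]                            # here v T^2 != 0 and v T^3 = 0

w = v
for _ in range(2000):
    w = vecmat(w, g)                     # w = v g^n, computed with exact integers

limit = projective_point(vecmat(vecmat(v, T), T))   # the point [v T^2]
approx = projective_point(w)                        # the point [v] g^2000
```

After 2000 steps `approx` agrees with `limit` to within about $10^{-3}$ in each coordinate, and the gap shrinks roughly like $1/n$, in line with the ratio $\binom{n}{k}/\binom{n}{r}$ in the proof.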

Proof of the Borel Density Theorem (4.7.1). By Chevalley's
Theorem (4.5.2), there exist

• a polynomial homomorphism $\rho\colon \mathrm{SL}(\ell, \mathbb{R}) \to \mathrm{SL}(m, \mathbb{R})$, for some $m$,
and

• a vector $v \in \mathbb{R}^m$,

such that $\overline{\Gamma} = \{\, g \in \mathrm{SL}(\ell, \mathbb{R}) \mid v g^\rho \in \mathbb{R}v \,\}$. In other words, letting $[v]$
be the image of $v$ in $\mathbb{R}P^{m-1}$, we have

$$\overline{\Gamma} = \{\, g \in \mathrm{SL}(\ell, \mathbb{R}) \mid [v]g^\rho = [v] \,\}. \qquad (4.7.6)$$

Since $\rho(\Gamma)$ fixes $[v]$, the function $\rho$ induces a well-defined map
$\overline{\rho}\colon \Gamma \backslash G \to \mathbb{R}P^{m-1}$:

$$\overline{\rho}(\Gamma g) = [v]g^\rho.$$

Because $\Gamma$ is a lattice in $G$, there is a $G$-invariant probability measure $\mu_0$
on $\Gamma \backslash G$. The map $\overline{\rho}$ pushes this to a probability measure $\mu = \overline{\rho}_* \mu_0$ on
$\mathbb{R}P^{m-1}$, defined by $\mu(A) = \mu_0\bigl(\overline{\rho}^{-1}(A)\bigr)$ for $A \subset \mathbb{R}P^{m-1}$. Because $\mu_0$
is $G$-invariant and $\rho$ is a homomorphism, it is easy to see that $\mu$ is
$\rho(G)$-invariant.

Let $g$ be any element of $G$ that is either unipotent or hyperbolic.
From the conclusion of the preceding paragraph, we know that $\mu$ is
$g^\rho$-invariant. Since $g^\rho$ is either unipotent or hyperbolic (see Cor. 4.3.6),
Lem. 4.7.5 implies that $\mu$ is supported on the set of fixed points of $g^\rho$.
Since $[v]$ is obviously in the support of $\mu$ (see Exer. 5), we conclude
that $[v]$ is fixed by $g^\rho$; that is, $[v]g^\rho = [v]$. From (4.7.6), we conclude
that $g \in \overline{\Gamma}$, as desired. □

Exercises for §4.7. 

#1. Show (without using the Borel Density Theorem) that the Zariski
closure of $\mathrm{SL}(\ell, \mathbb{Z})$ is $\mathrm{SL}(\ell, \mathbb{R})$.

[Hint: Let $\Gamma = \mathrm{SL}(\ell, \mathbb{Z})$, and let $H = \overline{\Gamma}^\circ$. If $g \in \mathrm{SL}(\ell, \mathbb{Q})$, then $g^{-1} \Gamma g$
contains a finite-index subgroup of $\Gamma$. Therefore $g$ normalizes $H$.
Because $\mathrm{SL}(\ell, \mathbb{Q})$ is dense in $\mathrm{SL}(\ell, \mathbb{R})$, this implies that $H$ is a normal
subgroup of $\mathrm{SL}(\ell, \mathbb{R})$. Now apply Eg. 4.4.5(1).]

#2. Use the Borel Density Theorem to show that if $\Gamma$ is any lattice
in $\mathrm{SL}(\ell, \mathbb{R})$, then $\overline{\Gamma} = \mathrm{SL}(\ell, \mathbb{R})$.

[Hint: $\mathrm{SL}(\ell, \mathbb{R})$ is generated by its unipotent elements.]

#3. Show that there is a natural covering map from the $(m - 1)$-sphere
$S^{m-1}$ onto $\mathbb{R}P^{m-1}$, so $\mathbb{R}P^{m-1}$ is a $C^\infty$ manifold.

#4. In the notation of Lem. 4.7.5, show that if $g$ is hyperbolic, and
$v$ is any nonzero vector in $\mathbb{R}^m$, then $[v]g^n$ converges to a fixed
point of $g$, as $n \to \infty$.

[Hint: Assume $g$ is diagonal. For $v = (v_1, \ldots, v_m)$, calculate $v g^n$.]

#5. In the notation of the proof of Prop. 4.7.1, show that the support
of $\mu$ is the closure of $[v]G^\rho$.

[Hint: If some point of $[v]G^\rho$ is contained in an open set of measure 0,
then, because $\mu$ is invariant under $\rho(G)$, all of $[v]G^\rho$ is contained in an
open set of measure 0.]

#6. (The Borel Density Theorem, essentially as stated by Borel)
Suppose

• $G$ is a connected, semisimple subgroup of $\mathrm{SL}(\ell, \mathbb{R})$, such that
every simple factor of $G$ is noncompact, and

• $\Gamma$ is a lattice in $G$.
Show:

(a) $G \subset \overline{\Gamma}$,

(b) $\Gamma$ is not contained in any proper, closed subgroup of $G$ that
has only finitely many connected components, and

(c) if $\rho\colon G \to \mathrm{SL}(m, \mathbb{R})$ is any continuous homomorphism, then
every element of $\rho(G)$ is a finite linear combination (with real
coefficients) of elements of $\rho(\Gamma)$.

[Hint: Use Prop. 4.7.1. (6c) The subspace of $\operatorname{Mat}_{m \times m}(\mathbb{R})$ spanned by
$\rho(\Gamma)$ is invariant under multiplication by $\rho(\Gamma)$, so it must be invariant
under multiplication by $\rho(G)$.]

#7. Suppose

• $G$ is a closed, connected subgroup of $\mathrm{SL}(\ell, \mathbb{R})$, and

• $\Gamma$ is a lattice in $G$.

Show there are only countably many closed, connected
subgroups $S$ of $G$, such that

(a) $\Gamma \cap S$ is a lattice in $S$, and

(b) there is a one-parameter unipotent subgroup $u^t$ of $S$, such
that $(\Gamma \cap S)\{u^t\}$ is dense in $S$.

[Hint: You may assume, without proof, the fact that every lattice
in every connected Lie group is finitely generated. Show that $S$ is contained
in the Zariski closure $\overline{\Gamma \cap S}$.
Conclude that $S$ is uniquely determined by $\Gamma \cap S$.]

#8. Suppose

• $G$ is an almost-Zariski closed subgroup of $\mathrm{SL}(\ell, \mathbb{R})$,

• $U$ is a connected, unipotent subgroup of $G$,

• $\Gamma$ is a discrete subgroup of $G$,

• $\mu$ is an ergodic $U$-invariant probability measure on $\Gamma \backslash G$, and

• there does not exist a subgroup $H$ of $G$, such that

o $H$ is almost Zariski closed,
o $U \subset H$, and

o some $H$-orbit has full measure.

Show, for all $x \in \Gamma \backslash G$ and every subset $V$ of $G$, that if $\mu(xV) > 0$,
then $G$ is contained in the Zariski closure $\overline{V}$ of $V$.

[Hint: Assume $V$ is Zariski closed and irreducible. Let

$$U_{xV} = \{\, u \in U \mid xVu = xV \,\} \quad\text{and}\quad U_V = \{\, u \in U \mid Vu = V \,\}.$$

Assuming that $V$ is minimal with $\mu(xV) > 0$, we have

$$\mu(xV \cap xVu) = 0 \quad \text{for } u \in U \setminus U_{xV}.$$

So $U / U_{xV}$ is finite. Since $U$ is connected, then $U_{xV} = U$. Similarly
(and because $\Gamma$ is countable), $U / U_V$ is countable, so $U_V = U$.

Let Ty = { 7 G T | V7 = V}. Then \x defines a measure fiv 
on rv\V, and this pushes to a measure ~pv on IV\V^. By combin- 
ing Chevalley's Theorem (4.5.2), the Borel Density Theorem (4.7.5), 
and the ergodicity of U , conclude that Tiy is supported on a single 
point. Letting H — {Fv,U}, some if-orbit has positive measure, and 
is contained in xV.]

4.8. Subgroups defined over Q

In this section, we briefly discuss the relationship between lattice 
subgroups and the integer points of a group. This material is not needed 
for the proof of Ratner's Theorem, but it is related, and it is used in 
many applications, including Margulis' Theorem on values of quadratic 
forms (1.2.2). 

(4.8.1) Definition. A Zariski closed subset Z of SL(ℓ, ℝ) is said to be
defined over ℚ if the defining polynomials for Z can be taken to have
all of their coefficients in ℚ; that is, if Z = Var(𝒬) for some subset 𝒬
of ℚ[x_{1,1}, ..., x_{ℓ,ℓ}].

(4.8.2) Example. The algebraic groups in (1-5) of Eg. 4.1.2 are defined
over ℚ. Those in (6-8) may or may not be defined over ℚ, depending
on the particular choice of v, V, or Q. Namely:

A) The stabilizer of a vector v is defined over ℚ if and only if v is a
scalar multiple of a vector in ℤ^ℓ (see Exer. 3).

B) The stabilizer of a subspace V of ℝ^ℓ is defined over ℚ if and only
if V is spanned by vectors in ℤ^ℓ (see Exer. 4).

C) The special orthogonal group SO(Q) of a nondegenerate quadratic
form Q is defined over ℚ if and only if Q is a scalar multiple
of a form with integer coefficients (see Exer. 5).

(4.8.3) Definition. A polynomial function φ: H → SL(n, ℝ) is defined
over ℚ if it can be obtained as in Defn. 4.3.5, but with
ℝ[x_{1,1}, ..., x_{ℓ,ℓ}] replaced by ℚ[x_{1,1}, ..., x_{ℓ,ℓ}] in 4.3.5(1). That is, only
polynomials with rational coefficients are allowed in the construction
of φ.

The fact that ℤ^k is a lattice in ℝ^k has a vast generalization:

(4.8.4) Theorem (Borel and Harish-Chandra). Suppose

• G is a Zariski closed subgroup of SL(ℓ, ℝ),

• G is defined over ℚ, and

• no nontrivial polynomial homomorphism from G° to 𝔻_2 is defined
over ℚ,

then G ∩ SL(ℓ, ℤ) is a lattice in G.

(4.8.5) Corollary. SL(ℓ, ℤ) is a lattice in SL(ℓ, ℝ).

(4.8.6) Example. 𝔻_2 ∩ SL(2, ℤ) = {±I} is finite, so it is not a lattice
in 𝔻_2.

It is interesting to note that Cor. 4.8.5 can be proved from prop- 
erties of unipotent flows. (One can then use this to obtain the general 
case of Thm. 4.8.4, but this requires some of the theory of "arithmetic
groups" (cf. Exer. 11).)

Direct proof of Cor. 4.8.5. Let G = SL(ℓ, ℝ) and Γ = SL(ℓ, ℤ). For

• a nontrivial, unipotent one-parameter subgroup u^t, and

• a compact subset K of Γ\G,
we define f: Γ\G → ℝ^{≥0} by

\[ f(x) = \liminf_{L \to \infty} \frac{1}{L} \int_0^L \chi_K(x u^t) \, dt, \]

where χ_K is the characteristic function of K.

The key to the proof is that the conclusion of Thm. 1.9.2 can be
proved by using the polynomial nature of u^t, without knowing that Γ
is a lattice. Furthermore, a single compact set K can be chosen to work
for all x in any compact subset of Γ\G. This means that, by choosing K
appropriately, we may assume that f > 0 on some nonempty open set.

Letting μ be the Haar measure on Γ\G, we have ∫_{Γ\G} f dμ ≤
μ(K) < ∞, so f ∈ L^1(Γ\G, μ).

It is easy to see, from the definition, that f is u^t-invariant. Therefore,
the Moore Ergodicity Theorem implies that f is essentially G-invariant
(see Exer. 3.2#6). So f is essentially constant.

If a nonzero constant is in L^1, then the space must have finite
measure. So Γ is a lattice. □
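
The bound on the integral of f in the proof above can be unpacked as follows. This is only a sketch: it assumes that Fatou's Lemma and Tonelli's Theorem apply (the integrand is nonnegative and measurable), and it uses that the Haar measure μ is invariant under right translation by u^t and is finite on the compact set K.

```latex
\begin{align*}
\int_{\Gamma\backslash G} f \, d\mu
  &= \int_{\Gamma\backslash G} \liminf_{L\to\infty}
       \frac{1}{L}\int_0^L \chi_K(xu^t)\,dt \, d\mu(x) \\
  &\le \liminf_{L\to\infty} \frac{1}{L}\int_0^L
       \int_{\Gamma\backslash G} \chi_K(xu^t)\,d\mu(x)\,dt
     && \text{(Fatou, then Tonelli)} \\
  &= \liminf_{L\to\infty} \frac{1}{L}\int_0^L \mu(K u^{-t})\,dt
     && \text{(since $xu^t \in K \iff x \in Ku^{-t}$)} \\
  &= \mu(K)
     && \text{(invariance of $\mu$ under $u^t$)}.
\end{align*}
```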

Exercises for §4.8. 

#1. Show that if C is any subset of SL(ℓ, ℚ), then \overline{C} is defined over ℚ.
[Hint: Suppose \overline{C} = Var(𝒬), for some 𝒬 ⊂ 𝒫^d, where 𝒫^d is the set of
polynomials of degree ≤ d. Because the subspace { Q ∈ 𝒫^d | Q(C) = 0 }
of 𝒫^d is defined by linear equations with rational coefficients, it is
spanned by rational vectors.]

#2. (Requires some commutative algebra) Let Z be a Zariski closed
subset of SL(ℓ, ℝ). Show that Z is defined over ℚ if and only if
σ(Z) = Z, for every Galois automorphism σ of ℂ.
[Hint: (⇐) You may assume Hilbert's Nullstellensatz, which implies
there is a subset 𝒬 of \overline{ℚ}[x_{1,1}, ..., x_{ℓ,ℓ}], such that Z = Var(𝒬), where
\overline{ℚ} is the algebraic closure of ℚ. Then \overline{ℚ} may be replaced with some
finite Galois extension F of ℚ, with Galois group Φ. For Q ∈ 𝒬, any
symmetric function of { Q^φ | φ ∈ Φ } has rational coefficients.]

#3. Verify Eg. 4.8.2(A).

[Hint: (⇒) The vector v fixed by Stab_{SL(ℓ,ℝ)}(v) is unique, up to a
scalar multiple. Thus, v^φ ∈ ℝv, for every Galois automorphism φ of ℂ.
Assuming some coordinate of v is rational (and nonzero), then all the
coordinates of v must be rational.]

#4. Verify Eg. 4.8.2(B).

[Hint: (⇒) Cf. Hint to Exer. 3. Any nonzero vector in V with the
minimal number of nonzero coordinates (and some coordinate rational)
must be fixed by each Galois automorphism of ℂ. So V contains a
rational vector v. By a similar argument, there is a rational vector
that is linearly independent from v. By induction, create a basis of
rational vectors.]

#5. Verify Eg. 4.8.2(C).

[Hint: (⇒) Cf. Hint to Exer. 3. The quadratic form Q is unique, up to
a scalar multiple.]

#6. Suppose Q is a quadratic form on ℝ^n, such that SO(Q)° is defined
over ℚ. Show that Q is a scalar multiple of a form with integer
coefficients.

[Hint: The invariant form corresponding to SO(Q) is unique up to a
scalar multiple. We may assume one coefficient is 1, so Q is fixed by
every Galois automorphism of ℂ.]

#7. Suppose

• G is a Zariski closed subgroup of SL(ℓ, ℝ),

• G° is generated by its unipotent elements, and

• G ∩ SL(ℓ, ℤ) is a lattice in G.
Show G is defined over ℚ.

[Hint: Use the Borel Density Theorem (4.7.1).]

#8. Suppose σ: G → SL(m, ℝ) is a polynomial homomorphism that
is defined over ℚ. Show:

(a) σ(G ∩ SL(ℓ, ℤ)) ⊂ SL(m, ℚ), and

(b) there is a finite-index subgroup Γ of G ∩ SL(ℓ, ℤ), such that
σ(Γ) ⊂ SL(m, ℤ).

[Hint: (8b) There is a nonzero integer k, such that if g ∈ G ∩ SL(ℓ, ℤ)
and g ≡ I (mod k), then σ(g) ∈ SL(m, ℤ).]

#9. Suppose G is a Zariski closed subgroup of SL(ℓ, ℝ). Show that
if some nontrivial polynomial homomorphism from G° to 𝔻_2 is
defined over ℚ, then G ∩ SL(ℓ, ℤ) is not a lattice in G.

#10. Show that if G is a Zariski closed subgroup of SL(ℓ, ℝ) that is
defined over ℚ, and G° is generated by its unipotent elements,
then G ∩ SL(ℓ, ℤ) is a lattice in G.

#11. Suppose

• G is a connected, noncompact, simple subgroup of SL(ℓ, ℝ),

• Γ = G ∩ SL(ℓ, ℤ), and

• the natural inclusion τ: Γ\G ↪ SL(ℓ, ℤ)\SL(ℓ, ℝ), defined
by τ(Γx) = SL(ℓ, ℤ)x, is proper.

Show (without using Thm. 4.8.4) that Γ is a lattice in G.
[Hint: See the proof of Cor. 4.8.5.]

4.9. Appendix on Lie groups

In this section, we briefly recall (without proof) some facts from
the theory of Lie groups.

(4.9.1) Definition. A group G is a Lie group if the underlying set is a
C^∞ manifold, and the group operations (multiplication and inversion)
are C^∞ functions.

A closed subset of a Lie group need not be a manifold (it could be 
a Cantor set, for example), but this phenomenon does not occur for 
subgroups: 

(4.9.2) Theorem. Any closed subgroup of a Lie group is a Lie group. 

It is easy to see that the universal cover of a (connected) Lie group 
is a Lie group. 

(4.9.3) Definition. Two connected Lie groups G and H are locally
isomorphic if their universal covers are C^∞ isomorphic.

We consider only linear Lie groups; that is, Lie groups that are
closed subgroups of SL(ℓ, ℝ), for some ℓ. The following classical theorem
shows that, up to local isomorphism, this results in no loss of generality.

(4.9.4) Theorem (Ado-Iwasawa). Any connected Lie group is locally
isomorphic to a closed subgroup of SL(ℓ, ℝ), for some ℓ.

It is useful to consider subgroups that need not be closed, but may 
only be immersed submanifolds: 

(4.9.5) Definition. A subgroup H of a Lie group G is a Lie subgroup
if there is a Lie group H_0 and an injective C^∞ homomorphism σ: H_0 →
G, such that H = σ(H_0). Then we consider H to be a Lie group, by
giving it a topology that makes σ a homeomorphism. (If H is not closed,
this is not the topology that H acquires by being a subset of G.)

(4.9.6) Definition. Let G be a Lie subgroup of SL(ℓ, ℝ). The tangent
space to G at the identity element e is the Lie algebra of G. It is,
by definition, a vector subspace of the space Mat_{ℓ×ℓ}(ℝ) of ℓ × ℓ real
matrices.

The Lie algebra of a Lie group G, H, U, S, etc., is usually denoted
by the corresponding lower-case gothic letter 𝔤, 𝔥, 𝔲, 𝔰, etc.

(4.9.7) Example.

1) The Lie algebra of 𝕌_ℓ is

\[ \begin{bmatrix} 0 & & & \\ * & 0 & & \\ \vdots & \ddots & \ddots & \\ * & \cdots & * & 0 \end{bmatrix}, \]

the space of strictly lower-triangular matrices.

2) Let d det: Mat_{ℓ×ℓ}(ℝ) → ℝ be the derivative of the determinant
map det at the identity matrix I. Then (d det)(A) = trace A.
Therefore the Lie algebra of SL(ℓ, ℝ) is

𝔰𝔩(ℓ, ℝ) = { A ∈ Mat_{ℓ×ℓ}(ℝ) | trace A = 0 }.

So the Lie algebra of any Lie subgroup of SL(ℓ, ℝ) is contained
in 𝔰𝔩(ℓ, ℝ).

3) The Lie algebra of 𝔻_ℓ is

\[ \left\{ \begin{bmatrix} a_1 & & \\ & \ddots & \\ & & a_\ell \end{bmatrix} \;\middle|\; a_1 + \cdots + a_\ell = 0 \right\}, \]

the space of diagonal matrices of trace 0.
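
The claim that the derivative of det at the identity is the trace can be checked numerically. The following is a minimal sketch, not part of the original text: the helper names (`det`, `trace`, `det_difference_quotient`) and the test matrix `A` are hypothetical choices made only for this illustration, and exact rational arithmetic is used so that no rounding is involved.

```python
from fractions import Fraction

def det(m):
    """Determinant by Laplace expansion along the first row (fine for small matrices)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

def trace(m):
    return sum(m[i][i] for i in range(len(m)))

def det_difference_quotient(a, eps):
    """(det(I + eps*A) - det(I)) / eps, computed exactly with rationals."""
    n = len(a)
    shifted = [[(1 if i == j else 0) + eps * a[i][j] for j in range(n)]
               for i in range(n)]
    return (det(shifted) - 1) / eps

# A hypothetical 3x3 matrix of trace 0, i.e. an element of sl(3, R).
A = [[Fraction(2), Fraction(-1), Fraction(3)],
     [Fraction(0), Fraction(5), Fraction(1)],
     [Fraction(4), Fraction(-2), Fraction(-7)]]

for k in (2, 4, 6):
    eps = Fraction(1, 10 ** k)
    # The gap between the difference quotient and trace(A) shrinks with eps.
    print(k, float(det_difference_quotient(A, eps) - trace(A)))
```

Because trace A = 0 here, the difference quotient itself tends to 0, which is the infinitesimal form of the statement that the curve I + εA stays tangent to SL(3, ℝ) at the identity.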

(4.9.8) Definition.

1) For x, y ∈ Mat_{ℓ×ℓ}(ℝ), let [x, y] = xy − yx. This is the Lie bracket
of x and y.

2) A vector subspace 𝔥 of Mat_{ℓ×ℓ}(ℝ) is a Lie subalgebra if [x, y] ∈
𝔥 for all x, y ∈ 𝔥.

3) A linear map τ: 𝔤 → 𝔥 between Lie subalgebras is a Lie algebra
homomorphism if τ([x, y]) = [τ(x), τ(y)] for all x, y ∈ 𝔤.

(4.9.9) Proposition.

1) The Lie algebra of any Lie subgroup of SL(ℓ, ℝ) is a Lie subalgebra.

2) Any Lie subalgebra 𝔥 of 𝔰𝔩(ℓ, ℝ) is the Lie algebra of a unique
connected Lie subgroup H of SL(ℓ, ℝ).

3) The differential of a Lie group homomorphism is a Lie algebra
homomorphism. That is, if φ: G → H is a C^∞ Lie group homomorphism,
and Dφ is the derivative of φ at e, then Dφ is a Lie
algebra homomorphism from 𝔤 to 𝔥.

4) A connected Lie group is uniquely determined (up to local
isomorphism) by its Lie algebra. That is, two connected Lie groups
G and H are locally isomorphic if and only if their Lie algebras
are isomorphic.

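(4.9.10) Definition. The exponential map

exp: 𝔰𝔩(ℓ, ℝ) → SL(ℓ, ℝ)
is defined by the usual power series

\[ \exp x = I + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots . \]

(4.9.11) Example. Let

\[ a = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}, \quad
   u = \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}, \quad
   v = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}. \]

Then, letting

\[ a^s = \begin{bmatrix} e^s & 0 \\ 0 & e^{-s} \end{bmatrix}, \quad
   u^t = \begin{bmatrix} 1 & 0 \\ t & 1 \end{bmatrix}, \quad
   v^r = \begin{bmatrix} 1 & r \\ 0 & 1 \end{bmatrix}, \]

as usual in SL(2, ℝ), it is easy to see that:

1) exp(sa) = a^s,

2) exp(tu) = u^t,

3) exp(rv) = v^r,

4) [u, a] = 2u,

5) [v, a] = −2v, and

6) [v, u] = a.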
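
The six identities of this example can be spot-checked mechanically. The sketch below assumes the standard matrices a = diag(1, −1), u lower-triangular and v upper-triangular (the choice that is consistent with relations (4)-(6)); the helper names and the sample parameters are hypothetical, and the exponential is only approximated by a partial sum of the power series.

```python
import math

def matmul(x, y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def scale(c, x):
    return [[c * entry for entry in row] for row in x]

def bracket(x, y):
    """Lie bracket [x, y] = xy - yx."""
    xy, yx = matmul(x, y), matmul(y, x)
    return [[xy[i][j] - yx[i][j] for j in range(2)] for i in range(2)]

def exp_series(x, terms=40):
    """Partial sum I + x + x^2/2! + ... of the exponential power series."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        term = scale(1.0 / n, matmul(term, x))  # term is now x^n / n!
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result

def close(x, y, tol=1e-9):
    return all(abs(x[i][j] - y[i][j]) < tol for i in range(2) for j in range(2))

a = [[1, 0], [0, -1]]
u = [[0, 0], [1, 0]]
v = [[0, 1], [0, 0]]

def a_s(s):
    return [[math.exp(s), 0.0], [0.0, math.exp(-s)]]

def u_t(t):
    return [[1.0, 0.0], [t, 1.0]]

def v_r(r):
    return [[1.0, r], [0.0, 1.0]]

s, t, r = 0.7, -1.3, 2.5  # arbitrary sample parameters
print(close(exp_series(scale(s, a)), a_s(s)))   # identity (1)
print(close(exp_series(scale(t, u)), u_t(t)))   # identity (2)
print(close(exp_series(scale(r, v)), v_r(r)))   # identity (3)
print(bracket(u, a) == scale(2, u))             # identity (4)
print(bracket(v, a) == scale(-2, v))            # identity (5)
print(bracket(v, u) == a)                       # identity (6)
```

A numerical check at a few parameter values is of course not a proof; Exer. 2 asks for the actual verification.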

(4.9.12) Proposition. Let 𝔤 be the Lie algebra of a Lie subgroup G
of SL(ℓ, ℝ). Then:

1) exp 𝔤 ⊂ G.

2) For any g ∈ 𝔤, the map ℝ → G defined by g^t = exp(tg) is a
one-parameter subgroup of G.

3) The restriction of exp to some neighborhood of 0 in 𝔤 is a
diffeomorphism onto some neighborhood of e in G.

(4.9.13) Definition.

1) A group G is solvable if there is a chain

e = G_0 ⊂ G_1 ⊂ ⋯ ⊂ G_k = G
of subgroups of G, such that, for 1 ≤ i ≤ k,

(a) G_{i−1} is a normal subgroup of G_i, and

(b) the quotient group G_i/G_{i−1} is abelian.

2) Any Lie group G has a unique maximal closed, connected, solvable,
normal subgroup. This is called the radical of G, and is
denoted Rad G.

3) A Lie group G is said to be semisimple if Rad G = {e}.

(4.9.14) Remark. According to Defn. 4.4.1, G is semisimple if G° has
no nontrivial, connected, abelian, normal subgroups. One can show
that this implies there are also no nontrivial, connected, solvable
normal subgroups.

(4.9.15) Theorem. Any Lie group G has a closed subgroup L, such that

1) L is semisimple and

2) G = L Rad G.

The subgroup L is called a Levi subgroup of G; it is usually not 
unique. 

(4.9.16) Warning. The above definition is from the theory of Lie 
groups. In the theory of algebraic groups, the term Levi subgroup 
is usually used to refer to a slightly different subgroup — namely, the 
subgroup LT of Thm. 4.4.7. 

(4.9.17) Theorem (Lie-Kolchin Theorem). If G is any connected, solvable
Lie subgroup of SL(ℓ, ℝ), then there exists h ∈ SL(ℓ, ℂ), such that
h^{-1}Gh ⊂ 𝔻_ℓ 𝕌_ℓ.

(4.9.18) Definition. Let 𝔤 be the Lie algebra of a Lie subgroup G of
SL(ℓ, ℝ).

• We use GL(𝔤) to denote the group of all invertible linear
transformations 𝔤 → 𝔤. This is a Lie group, and its Lie algebra 𝔤𝔩(𝔤)
consists of all (not necessarily invertible) linear transformations
𝔤 → 𝔤.

• We define a group homomorphism Ad_G: G → GL(𝔤) by

x(Ad_G g) = g^{-1}xg.

Note that Ad_G g is the derivative at e of the group automorphism
x ↦ g^{-1}xg, so Ad_G g is a Lie algebra automorphism.

• We define a Lie algebra homomorphism ad_𝔤: 𝔤 → 𝔤𝔩(𝔤) by

x(ad_𝔤 g) = [x, g].
We remark that ad_𝔤 is the derivative at e of Ad_G.

(4.9.19) Remark. A Lie group G is unimodular (that is, the right
Haar measure is also invariant under left translations) if and only if
det(Ad_G g) = 1, for all g ∈ G.

(4.9.20) Proposition. The maps exp, Ad_G and ad_𝔤 are natural. That
is, if ρ: G → H is a Lie group homomorphism, and dρ is the derivative
of ρ at e, then

1) (exp g)^ρ = exp(dρ(g)),

2) dρ(x(Ad_G g)) = (dρ x)(Ad_H g^ρ), and

3) dρ(x(ad_𝔤 g)) = (dρ x)(ad_𝔥 dρ(g)).

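(4.9.21) Corollary. We have Ad_G(exp g) = exp(ad_𝔤 g). That is,

\[ x\bigl(\mathrm{Ad}_G(\exp g)\bigr) = x + x(\mathrm{ad}_{\mathfrak g}\, g) + \tfrac{1}{2}\, x(\mathrm{ad}_{\mathfrak g}\, g)^2 + \tfrac{1}{3!}\, x(\mathrm{ad}_{\mathfrak g}\, g)^3 + \cdots \]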
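
As an illustration of this corollary, one can take g = u and x = v from Eg. 4.9.11. Because u^2 = 0, both exp(u) = I + u and the series on the right-hand side are finite sums, so the identity can be checked exactly. The sketch below assumes the matrices u and v are the standard lower- and upper-triangular nilpotent matrices; the helper names are hypothetical.

```python
from fractions import Fraction

def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(x, y):
    return [[p + q for p, q in zip(rx, ry)] for rx, ry in zip(x, y)]

def scale(c, x):
    return [[c * entry for entry in row] for row in x]

def bracket(x, y):
    """Lie bracket [x, y] = xy - yx."""
    return add(matmul(x, y), scale(-1, matmul(y, x)))

def ad_exponential(x, g, terms=8):
    """Partial sum x + x(ad g) + x(ad g)^2/2! + ..., where x(ad g) = [x, g]."""
    total, term = x, x
    for n in range(1, terms):
        term = scale(Fraction(1, n), bracket(term, g))  # x (ad g)^n / n!
        total = add(total, term)
    return total

u = [[0, 0], [1, 0]]
v = [[0, 1], [0, 0]]
exp_u = [[1, 0], [1, 1]]       # exp(u) = I + u, because u^2 = 0
exp_u_inv = [[1, 0], [-1, 1]]  # exp(-u) = I - u

# Left side: v(Ad exp u) = (exp u)^{-1} v (exp u), in the right-action convention.
left = matmul(matmul(exp_u_inv, v), exp_u)
# Right side: the series, which terminates because (ad u)^3 = 0 on 2x2 matrices.
right = ad_exponential(v, u)
print(left == right)
```

Here the series reduces to v + [v, u] + ½[[v, u], u] = v + a − u, in agreement with the commutation relations of Eg. 4.9.11.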

The commutation relations (4,5,6) of Eg. 4.9.11 lead to a complete
understanding of all 𝔰𝔩(2, ℝ)-modules:

(4.9.22) Proposition. Suppose

• 𝒮 is a finite-dimensional real vector space, and

• ρ: 𝔰𝔩(2, ℝ) → 𝔰𝔩(𝒮) is a Lie algebra homomorphism.

Then there is a sequence λ_1, ..., λ_n of natural numbers, and a basis

{ w_{i,j} | 1 ≤ i ≤ n, 0 ≤ j ≤ λ_i }

of 𝒮, such that, for all i, j, we have:

1) w_{i,j} a^ρ = (2j − λ_i) w_{i,j},

2) w_{i,j} u^ρ = (λ_i − j) w_{i,j+1}, and

3) w_{i,j} v^ρ = j w_{i,j−1}.

(4.9.23) Remark. The above proposition has the following immediate
consequences.

1) Each w_{i,j} is an eigenvector for a^ρ, and all of the eigenvalues are
integers. (Therefore, a^ρ is diagonalizable over ℝ.)

2) For any integer λ, we let

𝒮_λ = { w ∈ 𝒮 | w a^ρ = λw }.

This is called the weight space corresponding to λ. (If λ is an
eigenvalue, it is the corresponding eigenspace; otherwise, it is
{0}.) A basis of 𝒮_λ is given by

{ w_{i,(λ+λ_i)/2} | 1 ≤ i ≤ n, |λ| ≤ λ_i, λ ≡ λ_i (mod 2) }.

3) For all λ, we have 𝒮_λ u^ρ ⊂ 𝒮_{λ+2} and 𝒮_λ v^ρ ⊂ 𝒮_{λ−2}.

4) The kernel of u^ρ is spanned by {w_{1,λ_1}, ..., w_{n,λ_n}}, and the kernel
of v^ρ is spanned by {w_{1,0}, ..., w_{n,0}}.

5) u^ρ and v^ρ are nilpotent.
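
The formulas of Prop. 4.9.22 can be tested on a single block (n = 1) by writing a^ρ, u^ρ and v^ρ as matrices and checking the commutation relations of Eg. 4.9.11. This is a minimal sketch under stated assumptions: the function names are hypothetical, and the matrices act on row vectors from the right (row j is the image of w_j), matching the convention w a^ρ of the text.

```python
def module_matrices(lam):
    """Matrices of a^rho, u^rho, v^rho on the basis w_0, ..., w_lam (one block)."""
    d = lam + 1
    A = [[0] * d for _ in range(d)]
    U = [[0] * d for _ in range(d)]
    V = [[0] * d for _ in range(d)]
    for j in range(d):
        A[j][j] = 2 * j - lam          # w_j a = (2j - lam) w_j
        if j + 1 < d:
            U[j][j + 1] = lam - j      # w_j u = (lam - j) w_{j+1}
        if j >= 1:
            V[j][j - 1] = j            # w_j v = j w_{j-1}
    return A, U, V

def matmul(x, y):
    d = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(d)) for j in range(d)]
            for i in range(d)]

def bracket(x, y):
    """[x, y] = xy - yx; with row vectors, w(xy) = (wx)y."""
    xy, yx = matmul(x, y), matmul(y, x)
    d = len(x)
    return [[xy[i][j] - yx[i][j] for j in range(d)] for i in range(d)]

def scale(c, x):
    return [[c * entry for entry in row] for row in x]

def relations_hold(lam):
    """Check [u, a] = 2u, [v, a] = -2v and [v, u] = a on the block of size lam + 1."""
    A, U, V = module_matrices(lam)
    return (bracket(U, A) == scale(2, U)
            and bracket(V, A) == scale(-2, V)
            and bracket(V, U) == A)

def is_nilpotent(x):
    d = len(x)
    power = x
    for _ in range(d - 1):
        power = matmul(power, x)       # after the loop, power = x^d
    return all(entry == 0 for row in power for entry in row)

for lam in range(6):
    A, U, V = module_matrices(lam)
    print(lam, relations_hold(lam), is_nilpotent(U), is_nilpotent(V))
```

In these matrices the only zero row of u^ρ is the one for w_λ, and the only zero row of v^ρ is the one for w_0, which is item (4) of the remark for a single block.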

Exercises for §4.9. 

#1. Suppose u^t is a nontrivial, one-parameter, unipotent subgroup of
SL(2, ℝ).

(a) Show that {u^t} is conjugate to 𝕌_2.

(b) Suppose a^s is a nontrivial, one-parameter, hyperbolic subgroup
of SL(2, ℝ) that normalizes {u^t}. Show there exists
h ∈ SL(2, ℝ), such that h^{-1}{a^s}h = 𝔻_2 and h^{-1}{u^t}h = 𝕌_2.

#2. Verify the calculations of Eg. 4.9.11. 

#3. Show that if a is a hyperbolic element of SL(ℓ, ℝ), and V is an
a-invariant subspace of ℝ^ℓ, then V has an a-invariant complement.
That is, there is an a-invariant subspace W of ℝ^ℓ, such that

ℝ^ℓ = V ⊕ W.
[Hint: A subspace of ℝ^ℓ is a-invariant if and only if it is a sum of
subspaces of eigenspaces.]

#4. Show that if L is a Levi subgroup of G, then L ∩ Rad G is discrete.
[Hint: L ∩ Rad G is a closed, solvable, normal subgroup of L.]

#5. Show that every continuous homomorphism ρ: ℝ^k → ℝ is a linear
map.

[Hint: Every homomorphism is ℚ-linear. Use continuity to show that
ρ is ℝ-linear.]

#6. Suppose T is a connected Lie subgroup of 𝔻_ℓ, and ρ: T → ℝ^+ is
a C^∞ homomorphism.

(a) Show there exist real numbers α_1, ..., α_ℓ, such that

ρ(t) = t_{1,1}^{α_1} ⋯ t_{ℓ,ℓ}^{α_ℓ}

for all t ∈ T.

(b) Show that if ρ is polynomial, then α_1, ..., α_ℓ are integers.
[Hint: (6a) Use Exer. 5.]

#7. Suppose 𝒮 and ρ are as in Prop. 4.9.22. Show:

(a) No proper ρ(𝔰𝔩(2, ℝ))-invariant subspace of 𝒮 contains ker u^ρ.

(b) If V and W are ρ(𝔰𝔩(2, ℝ))-invariant subspaces of 𝒮, such
that V ⊊ W, then V ∩ ker u^ρ ⊊ W ∩ ker u^ρ.

[Hint: (7b) Apply (7a) with W in the place of 𝒮.]

#8. Suppose

{ w_{i,j} | 1 ≤ i ≤ n, 0 ≤ j ≤ λ_i }

is a basis of a real vector space 𝒮, for some sequence λ_1, ..., λ_n
of natural numbers. Show that the equations 4.9.22(1,2,3) determine
linear transformations a^ρ, u^ρ, and v^ρ on 𝒮, such that the
commutation relations (4,5,6) of Eg. 4.9.11 are satisfied. Thus,
there is a Lie algebra homomorphism σ: 𝔰𝔩(2, ℝ) → 𝔰𝔩(𝒮), such
that σ(a) = a^ρ, σ(u) = u^ρ, and σ(v) = v^ρ.

Notes 

The algebraic groups that appear in these lectures are defined
over ℝ, and our only interest is in their real points. Furthermore, we
are interested only in linear groups (that is, subgroups of SL(ℓ, ℝ)), not
"abelian varieties." Thus, our definitions and terminology are tailored
to this setting.

There are many excellent textbooks on the theory of (linear) algebraic
groups, including [5, 16], but they generally focus on algebraic
groups over ℂ (or some other algebraically closed field). The books of
V. Platonov and A. Rapinchuk [23, Chap. 3] and A. L. Onishchik and
E. B. Vinberg [22] are excellent sources for information on algebraic
groups over ℝ.

§4.1. Standard textbooks discuss varieties, Zariski closed sets, al- 
gebraic groups, dimension, and the singular set. 

Whitney's Theorem (4.1.3) appears in [23, Cor. 1 of Thm. 3.6, 
p. 121]. 

§4.2. The Zariski closure is a standard topic. 

The notion of being "almost Zariski closed" does not arise over an 
algebraically closed field, so it is not described in most texts. Relevant 
material (though without using this terminology) appears in [23, §3.2] 
and [29, §3.1]. References to numerous specific results on almost-Zariski 
closed subgroups can be found in [28, §3]. 

Exercise 4.2#2 is a version of [16, Prop. 8.2b, p. 59]. 

§4.3. Polynomials, unipotent elements, and the Jordan decomposition
are standard material. However, most texts consider the Jordan
decomposition over ℂ, not ℝ. (Hyperbolic elements and elliptic
elements are lumped together into a single class of "semisimple"
elements.)

The real Jordan decomposition appears in [11, Lem. IX. 7.1, p. 430], 
for example. 

A solution of Exer. 4.3#12 appears in the proof of [29, Thm. 3.2.5, 
p. 42]. 

§4.4. This material is standard, except for Thm. 4.4.9 and its 
corollary (which do not occur over an algebraically closed field). 

The theory of roots and weights is described in many textbooks, 
including [11, 15, 25]. See [11, Table V, p. 518] for a list of the almost- 
simple groups. 

For the case of semisimple groups, the difficult direction (⇐) of
Thm. 4.4.9 is immediate from the "Iwasawa decomposition" G =
KAN, where K is compact, A is a hyperbolic torus, and N is unipotent.
This decomposition appears in [23, Thm. 3.9, p. 131], or in many
texts on Lie groups.

The proof of Engel's Theorem in Exer. 4.4#5 is taken from [16, 
Thm. 17.5, p. 112]. The theorem of Burnside mentioned there (or the 
more general Jacobson Density Theorem) appears in graduate algebra 
texts, such as [17, Cor. 3.4 of Chap. XVII]. 



§4.5. This is standard.

§4.6. These results are well known, but do not appear in most
texts on algebraic groups.

Proposition 4.6.1 is due to C. Chevalley [9, Prop. 2, §VI.5.2, p. 230]. 
A proof also appears in [1, §8.6]. 

See [13, Thm. 8.1.1, p. 107] for a proof of Prop. 4.6.2. 

See [13, Thm. 8.3.2, p. 112] for a proof of Thm. 4.6.3. 

The analogue of Prop. 4.6.4 over an algebraically closed field is 
standard (e.g., [16, Cor. 7.4, p. 54]). For a derivation of Prop. 4.6.4 
from this, see [28, Lem. 3.17]. 

See [23, Cor. 1 of Prop. 3.3, p. 113] for a proof of Cor. 4.6.5. 

Corollary 4.6.6 is proved in [8, Thm. 15, §11.14, p. 177] and [13,
Thm. 8.3.3, p. 113].

Completing the proof of Cor. 4.6.7 requires one to know that \overline{G}/G
is abelian for every connected, semisimple subgroup G of SL(ℓ, ℝ). In
fact, \overline{G}/G is trivial if \overline{G} is "simply connected" as an algebraic group
[23, Prop. 7.6, p. 407], and the general case follows from this by using
an exact sequence of Galois cohomology groups: G_ℝ → (G/Z)_ℝ →
H^1(ℂ/ℝ, Z_ℂ).

A proof of Lem. 4.6.9 appears in [7, §2.2]. (It is based on the analo- 
gous result over an algebraically closed field, which is a standard result 
that appears in [16, Prop. 7.5, p. 55], for example.) 

Exercise 4.6#1 is a version of [12, Thm. 8.1.1, p. 107]. 

§4.7. This material is fairly standard in ergodic theory, but not 
common in texts on algebraic groups. 

The Borel Density Theorem (4.7.1) was proved for semisimple 
groups in [3] (see Exer. 4.7#6). (The theorem also appears in [19, 
Lem. II.2.3 and Cor. II.2.6, p. 84] and [29, Thm. 3.2.5, pp. 41-42], 
for example.) The generalization to all Lie groups is due to S. G. Dani 
[10, Cor. 2.6]. 

The Poincare Recurrence Theorem (4.7.3) can be found in many 
textbooks on ergodic theory, including [2, Cor. 1.1.8, p. 8]. 
See [23, Thm. 4.10, p. 205] for a solution of Exer. 4.7#1. 
Exercise 4.7#7 is [24, Cor. A(2)]. 
Exercise 4.7#8 is [20, Prop. 3.2]. 

§4.8. This material is standard in the theory of "arithmetic groups."
(If G is defined over ℚ, then G ∩ SL(ℓ, ℤ) is said to be an arithmetic
group.) The book of Platonov and Rapinchuk [23] is an excellent reference
on the subject. See [21] for an introduction. There are also numerous
other books and survey papers.

Theorem 4.8.4 is due to A. Borel and Harish-Chandra [6]. (Many
special cases had previously been treated by C. L. Siegel [26].) Expositions
can also be found in [4, Cor. 13.2] and [23, Thm. 4.13]. (A proof of
only Cor. 4.8.5 appears in [2, §V.2].) These are based on the reduction
theory for arithmetic groups, not unipotent flows.

The observation that Thm. 4.8.4 can be obtained from a variation 
of Thm. 1.9.2 is due to G. A. Margulis [18, Rem. 3.12(11)]. 

§4.9. There are many textbooks on Lie groups, including [11, 12,
27]. The expository article of R. Howe [14] provides an elementary
introduction.

CHAPTER 5



Proof of the Measure-Classification 
Theorem 

In this chapter, we present the main ideas in a proof of the follow- 
ing theorem. The reader is assumed to be familiar with the concepts 
presented in Chap. 1. 

(5.0.1) Theorem (Ratner). If

• G is a closed, connected subgroup of SL(ℓ, ℝ), for some ℓ,

• Γ is a discrete subgroup of G,

• u^t is a unipotent one-parameter subgroup of G, and

• μ is an ergodic u^t-invariant probability measure on Γ\G,
then μ is homogeneous.

More precisely, there exist

• a closed, connected subgroup S of G, and

• a point x in Γ\G,
such that

1) μ is S-invariant, and

2) μ is supported on the orbit xS.

(5.0.2) Remark. If we write x = Γg, for some g ∈ G, and let Γ_S =
(g^{-1}Γg) ∩ S, then the conclusions imply that

1) under the natural identification of the orbit xS with the homogeneous
space Γ_S\S, the measure μ is the Haar measure on Γ_S\S,

2) Γ_S is a lattice in S, and

3) xS is closed
(see Exer. 1).

(5.0.3) Assumption. Later (see Assump. 5.3.1), in order to simplify
the details of the proof while losing very few of the main ideas, we will
make the additional assumption that

1) μ is invariant under a hyperbolic one-parameter subgroup {a^s}
that normalizes u^t, and

2) ⟨a^s, u^t⟩ is contained in a subgroup L = ⟨u^t, a^s, v^r⟩ that is locally
isomorphic to SL(2, ℝ).

See §5.9 for a discussion of the changes involved in removing this
hypothesis. The basic idea is that Prop. 1.6.10 shows that we may assume
Stab_G(μ) contains a one-parameter subgroup that is not unipotent. A
more sophisticated version of this argument, using the theory of algebraic
groups, shows that slightly weakened forms of (1) and (2) are
true. Making these assumptions from the start simplifies a lot of the
algebra, without losing any of the significant ideas from dynamics.

(5.0.4) Remark. Note that G is not assumed to be semisimple. Al- 
though the semisimple case is the most interesting, we allow ourselves 
more freedom, principally because the proof relies (at one point, in 
the proof of Thm. 5.7.2) on induction on dimG, and this induction is 
based on knowing the result for all connected subgroups, not only the 
semisimple ones. 

(5.0.5) Remark. There is no harm in assuming that G is almost Zariski 
closed (see Exer. 3). This provides a slight simplification in a couple of 
places (see Exer. 5.4#6 and the proof of Thm. 5.7.2). 

Exercises for §5.0. 

#1. Prove the assertions of Rem. 5.0.2 from the conclusions of Thm. 5.0.1. 

#2. Show that Thm. 5.0.1 remains true without the assumption that 
G is connected. 

[Hint: μ must be supported on a single connected component of Γ\G.
Apply Thm. 5.0.1 with G° in the place of G.]

#3. Assume Thm. 5.0.1 is true under the additional hypothesis that
G is almost Zariski closed. Prove that this additional hypothesis
can be eliminated.
[Hint: Γ\G embeds in Γ\SL(ℓ, ℝ).]

5.1. An outline of the proof 

Here are the main steps in the proof. 
1) Notation. 

• Let S = Stab_G(μ). We wish to show that μ is supported on
a single S-orbit.

• Let 𝔤 be the Lie algebra of G and 𝔰 be the Lie algebra of S.






• The expanding and contracting subspaces of a^s (for s > 0)
provide decompositions

𝔤 = 𝔤_- + 𝔤_0 + 𝔤_+ and 𝔰 = 𝔰_- + 𝔰_0 + 𝔰_+,

and we have corresponding subgroups G_-, G_0, G_+, S_-, S_0,
and S_+ (see Notn. 5.3.3).

• For convenience, let U = S_+. Note that U is unipotent, and
we may assume {u^t} ⊂ U, so μ is ergodic for U.

2) We are interested in transverse divergence of nearby orbits.
(We ignore relative motion along the U-orbits, and project to
G ⊖ U.) The shearing property of unipotent flows implies, for
a.e. x, y ∈ Γ\G, that if x ≈ y, then the transverse divergence of
the U-orbits through x and y is fastest along some direction in S
(see Prop. 5.2.4). Therefore, the direction belongs to G_-G_0 (see
Cor. 5.3.4).

3) We define a certain subgroup

S̃_- = { g ∈ G_- | ∀u ∈ U, u^{-1}gu ∈ G_-G_0U }

of G_- (cf. Defn. 5.4.1). Note that S_- ⊂ S̃_-.

The motivation for this definition is that if y ∈ xS̃_-, then
all of the transverse divergence belongs to G_-G_0; there is no
G_+-component to any of the transverse divergence. For clarity,
we emphasize that this restriction applies to all transverse
divergence, not only the fastest transverse divergence.

4) Combining (2) with the dilation provided by the translation a^{-s}
shows, for a.e. x, y ∈ Γ\G, that if y ∈ xG_-, then y ∈ xS̃_- (see
Cor. 5.5.2).

5) A Lie algebra calculation shows that if y ≈ x, and y = xg, with
g ∈ (G_- ⊖ S̃_-)G_0G_+, then the transverse divergence of the
U-orbits through x and y is fastest along some direction in G_+ (see
Lem. 5.5.3).

6) Because the conclusions of (2) and (5) are contradictory, we see,
for a.e. x, y ∈ Γ\G, that

if x ≈ y, then y ∉ x(G_- ⊖ S̃_-)G_0G_+

(cf. Cor. 5.5.4). (Actually, a technical problem causes us to obtain
this result only for x and y in a set of measure 1 − ε.)

7) The relation between stretching and entropy (Prop. 2.5.11)
provides bounds on the entropy of a^s, in terms of the Jacobian
of a^s on U and (using (4)) the Jacobian of a^{-s} on S̃_-:

J(a^s, U) ≤ h_μ(a^s) ≤ J(a^{-s}, S̃_-).

On the other hand, the structure of 𝔰𝔩(2, ℝ)-modules implies
that J(a^s, U) ≥ J(a^{-s}, S̃_-). Thus, we conclude that h_μ(a^s) =
J(a^{-s}, S̃_-). This implies that S̃_- ⊂ Stab_G(μ), so we must have
S̃_- = S_- (see Prop. 5.6.1).

8) By combining the conclusions of (6) and (7), we show that
μ(xS_-G_0G_+) > 0, for some x ∈ Γ\G (see Prop. 5.7.1).

9) By combining (8) with the (harmless) assumption that μ is not
supported on an orbit of any closed, proper subgroup of G, we
show that S_- = G_- (so S_- is horospherical), and then there are
a number of ways to show that S = G (see Thm. 5.7.2).

The following several sections expand this outline into a fairly com- 
plete proof, modulo some details that are postponed to §5.8. 

5.2. Shearing and polynomial divergence 

As we saw in Chap. 1, shearing and polynomial divergence are
crucial ingredients of the proof of Thm. 5.0.1. Precise statements will
be given in §5.8, but let us now describe them informally. Our goal here
is to prove that the direction of fastest divergence usually belongs to
the stabilizer of μ (see Prop. 5.2.4', which follows Cor. 5.2.5). This will
later be restated in a slightly more convenient (but weaker) form (see
Cor. 5.3.4).

(5.2.1) Lemma (Shearing). If U is any connected, unipotent subgroup
of G, then the transverse divergence of any two nearby U-orbits is
fastest along some direction that is in the normalizer N_G(U).

(5.2.2) Lemma (Polynomial divergence). If U is a connected, unipotent
subgroup of G, then any two nearby U-orbits diverge at polynomial
speed.

Hence, if it takes a certain amount of time for two nearby U -orbits 
to diverge to a certain distance, then the amount (and direction) of 
divergence will remain approximately the same for a proportional length 
of time. 
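
Both lemmas can be seen in the simplest case by a direct matrix calculation. The following is an illustrative sketch only, assuming G = SL(2, ℝ), that u^t is the lower-triangular unipotent subgroup of Eg. 4.9.11, and that y = xg with g close to the identity; the entries a, b, c, d of g are notation introduced just for this illustration. Since yu^t = (xu^t)(u^{-t}gu^t), the relative position of the two orbits at time t is

```latex
\[
u^{-t} g u^{t}
= \begin{bmatrix} 1 & 0 \\ -t & 1 \end{bmatrix}
  \begin{bmatrix} a & b \\ c & d \end{bmatrix}
  \begin{bmatrix} 1 & 0 \\ t & 1 \end{bmatrix}
= \begin{bmatrix} a + bt & b \\ c + (d-a)t - bt^2 & d - bt \end{bmatrix}.
\]
```

Every entry is a polynomial in t, as in Lem. 5.2.2. The quadratic term −bt^2 sits in the bottom-left corner, that is, along U itself. After discarding this motion along the orbit, the largest remaining terms are the ±bt on the diagonal, and the diagonal subgroup normalizes U, as in Lem. 5.2.1.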

By combining these two results we will establish the following con- 
clusion (cf. Cor. 1.6.9). It is the basis of the entire proof. 

(5.2.3) Notation. Let S = Stab_G(μ)°. This is a closed subgroup of G
(see Exer. 1).

(5.2.4) Proposition. If U is any connected, ergodic, unipotent subgroup
of S, then there is a conull subset Ω of Γ\G, such that, for all
x, y ∈ Ω, with x ≈ y, the U-orbits through x and y diverge fastest along
some direction that belongs to S.

This immediately implies the following interesting special case of
Ratner's Theorem (see Exer. 4), which was proved rather informally in
Chap. 1 (see Prop. 1.6.10).

(5.2.5) Corollary. If U = Stab_G(μ) is unipotent (and connected), then
μ is supported on a single U-orbit.

Although Prop. 5.2.4 is true (see Exer. 5), it seems to be very difficult
to prove from scratch, so we will be content with proving the
following weaker version that does not yield a conull subset, and imposes
a restriction on the relation between x and y (see 5.8.8). (See
Exer. 5.8#5 for a non-infinitesimal version of the result.)

(5.2.4') Proposition. For any

• connected, ergodic, unipotent subgroup U of S, and

• any ε > 0,

there is a subset Ω_ε of Γ\G, such that

1) μ(Ω_ε) > 1 − ε, and

2) for all x, y ∈ Ω_ε, with x ≈ y, and such that a certain technical
assumption (5.8.9) is satisfied, the fastest transverse divergence of
the U-orbits through x and y is along some direction that belongs
to S.

Proof (cf. Cor. 1.6.5). Let us assume that no N_G(U)-orbit has positive
measure, for otherwise it is easy to complete the proof (cf. Exer. 3).
Then, for a.e. x ∈ Γ\G, there is a point y ≈ x, such that

1) y ∉ xN_G(U), and

2) y is a generic point for μ (see Cor. 3.1.6).

Because y ∉ xN_G(U), we know that the orbit yU is not parallel
to xU, so they diverge from each other. From Lem. 5.2.1, we know
that the direction of fastest transverse divergence belongs to N_G(U),
so there exist u, u′ ∈ U, and c ∈ N_G(U) ⊖ U, such that

• yu′ ≈ (xu)c, and

• ‖c‖ ≍ 1 (i.e., ‖c‖ is finite, but not infinitesimal).

Because c ∉ U = Stab_G(μ), we know that c_*μ ≠ μ. Because c ∈ N_G(U),
this implies c_*μ ⊥ μ (see Exer. 6), so there is a compact subset K with
μ(K) > 1 − ε and K ∩ Kc = ∅ (see Exer. 7).

We would like to complete the proof by saying that there are values
of $u$ for which both of the two points $xu$ and $yu'$ are arbitrarily close
to $K$, which contradicts the fact that $d(K, Kc) > 0$. However, there are
two technical problems:

1) The set $K$ must be chosen before we know the value of $c$. This
issue is handled by Lem. 5.8.6.

2) The Pointwise Ergodic Theorem (3.4.3) implies (for a.e. $x$) that
$xu$ is arbitrarily close to $K$ a huge proportion of the time. But this
theorem does not apply directly to $yu'$, because $u'$ is a nontrivial
function of $u$. To overcome this difficulty, we add an additional
technical hypothesis on the element $g$ with $y = xg$ (see 5.8.8).
With this assumption, the result can be proved (see 5.8.7), by
showing that the Jacobian of the change of variables $u \mapsto u'$ is
bounded above and below on some set of reasonable size, and
applying the uniform approximate version of the Pointwise Ergodic
Theorem (see Cor. 3.4.4). The uniform estimate is what
requires us to restrict to a set of measure $1 - \epsilon$, rather than a
conull set. □

(5.2.6) Remark. 

1) The fact that $\Omega_\epsilon$ is not quite conull is not a serious problem,
although it does make one part of the proof more complicated
(cf. Prop. 5.7.1).

2) We will apply Prop. 5.2.4' only twice (in the proofs of Cors. 5.5.2 
and 5.5.4). In each case, it is not difficult to verify that the tech- 
nical assumption is satisfied (see Exers. 5.8#1 and 5.8#2). 

Exercises for §5.2. 

#1. Show that $\operatorname{Stab}_G(\mu)$ is a closed subgroup of $G$.

[Hint: $g \in \operatorname{Stab}_G(\mu)$ if and only if $\int f(xg)\, d\mu(x) = \int f \, d\mu$ for all
continuous functions $f$ with compact support.]

#2. Suppose

• $\nu$ is a (finite or infinite) Borel measure on $G$, and

• $N$ is a unimodular, normal subgroup of $G$.

Show that if $\nu$ is right-invariant under $N$ (that is, $\nu(An) = \nu(A)$
for all $n \in N$), then $\nu$ is left-invariant under $N$.

#3. Show that if

• $N$ is a unimodular, normal subgroup of $G$,

• $N$ is contained in $\operatorname{Stab}_G(\mu)$, and

• $N$ is ergodic on $\Gamma \backslash G$,
then $\mu$ is homogeneous.

[Hint: Lift $\mu$ to an (infinite) measure $\hat{\mu}$ on $G$, such that $\hat{\mu}$ is left
invariant under $\Gamma$, and right invariant under $N$. Exercise 2 implies that
$\hat{\mu}$ is left invariant (and ergodic) under the closure $H$ of $\Gamma N$. Ergodicity
implies that $\hat{\mu}$ is supported on a single $H$-orbit.]

#4. Prove Cor. 5.2.5 from Prop. 5.2.4 and Exer. 3.

[Hint: If $\mu(x N_G(U)) > 0$, for some $x \in \Gamma \backslash G$, then Exer. 3 (with
$N_G(U)$ in the place of $G$) implies that $\mu$ is homogeneous. Otherwise,
Prop. 5.2.4 implies that $\operatorname{Stab}_G(\mu) \smallsetminus U \neq \emptyset$.]

#5. Show that Thm. 5.0.1 implies Prop. 5.2.4.

#6. Show that if

• $\mu$ is $U$-invariant and ergodic, and

• $c \in N_G(U)$,
then

(a) $c_*\mu$ is $U$-invariant and ergodic, and

(b) either $c_*\mu = \mu$ or $c_*\mu \perp \mu$.

#7. Suppose

• $\epsilon > 0$,

• $\mu$ is $U$-invariant and ergodic,

• $c \in N_G(U)$, and

• $c_*\mu \perp \mu$.

Show that there is a compact subset $K$ of $\Gamma \backslash G$, such that
(a) $\mu(K) > 1 - \epsilon$, and
(b) $K \cap Kc = \emptyset$.

5.3. Assumptions and a restatement of 5.2.4' 

(5.3.1) Assumption. As mentioned in Assump. 5.0.3, we assume there
exist

• a closed subgroup $L$ of $G$ and

• a (nontrivial) one-parameter subgroup $\{a^s\}$ of $L$,
such that

1) $\{u^t\} \subset L$,

2) $\{a^s\}$ is hyperbolic, and normalizes $\{u^t\}$,

3) $\mu$ is invariant under $\{a^s\}$, and

4) $L$ is locally isomorphic to $\mathrm{SL}(2,\mathbb{R})$.

(5.3.2) Remark.

1) Under an appropriate local isomorphism between $L$ and $\mathrm{SL}(2,\mathbb{R})$,
the subgroup $\langle a^s, u^t \rangle$ maps to the group $\mathbb{D}_2\mathbb{U}_2$ of lower triangular
matrices in $\mathrm{SL}(2,\mathbb{R})$ (see Exer. 4.9#1).

2) Therefore, the parametrizations of $a^s$ and $u^t$ can be chosen so
that $a^{-s} u^t a^s = u^{e^{2s} t}$ for all $s$ and $t$.

3) The Mautner Phenomenon implies that the measure $\mu$ is ergodic
for $\{a^s\}$ (see Cor. 3.2.5).
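The conjugation rule in Rem. 5.3.2(2) can be checked numerically in the model case. The sketch below assumes the standard matrix realization in $\mathrm{SL}(2,\mathbb{R})$, with $a^s = \operatorname{diag}(e^s, e^{-s})$ and $u^t$ lower unitriangular; these concrete matrices and the function names are illustrative choices, not something fixed by the text above.

```python
import numpy as np

def a(s):
    """Assumed model: the diagonal one-parameter subgroup a^s = diag(e^s, e^-s)."""
    return np.array([[np.exp(s), 0.0], [0.0, np.exp(-s)]])

def u(t):
    """Assumed model: the lower unitriangular one-parameter subgroup u^t."""
    return np.array([[1.0, 0.0], [t, 1.0]])

def conjugate(s, t):
    """Return a^{-s} u^t a^s."""
    return a(-s) @ u(t) @ a(s)

# Compare a^{-s} u^t a^s with u^{e^{2s} t} for a sample of parameters.
for s, t in [(0.7, -1.3), (-2.0, 0.5), (0.0, 3.0)]:
    print(s, t, np.allclose(conjugate(s, t), u(np.exp(2 * s) * t)))
```

In this model, conjugation by $a^s$ with $s > 0$ stretches the $u$-parameter by the factor $e^{2s}$, which is the expansion used repeatedly in the shearing arguments of this chapter.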

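(5.3.3) Notation.

• For a (small) element $g$ of $G$, we use $\underline{g}$ to denote the corresponding
element $\log g$ of the Lie algebra $\mathfrak{g}$.

• Recall that $S = \operatorname{Stab}_G(\mu)^\circ$ (see Notn. 5.2.3).

• By renormalizing, let us assume that $[\underline{u}, \underline{a}] = 2\underline{u}$ (where $a = a^1$
and $u = u^1$).

• Let $\{v^r\}$ be the (unique) one-parameter unipotent subgroup of $L$,
such that $[\underline{v}, \underline{a}] = -2\underline{v}$ and $[\underline{v}, \underline{u}] = \underline{a}$ (see Eg. 4.9.11).

• Let $\bigoplus_{\lambda \in \mathbb{Z}} \mathfrak{g}_\lambda$ be the decomposition of $\mathfrak{g}$ into weight spaces of $\underline{a}$:
that is,

$\mathfrak{g}_\lambda = \{\, \underline{g} \in \mathfrak{g} \mid [\underline{g}, \underline{a}] = \lambda \underline{g} \,\}$.

• Let $\mathfrak{g}_+ = \bigoplus_{\lambda > 0} \mathfrak{g}_\lambda$, $\mathfrak{g}_- = \bigoplus_{\lambda < 0} \mathfrak{g}_\lambda$, $\mathfrak{s}_+ = \mathfrak{s} \cap \mathfrak{g}_+$, $\mathfrak{s}_- = \mathfrak{s} \cap \mathfrak{g}_-$,
and $\mathfrak{s}_0 = \mathfrak{s} \cap \mathfrak{g}_0$. Then

$\mathfrak{g} = \mathfrak{g}_- + \mathfrak{g}_0 + \mathfrak{g}_+$ and $\mathfrak{s} = \mathfrak{s}_- + \mathfrak{s}_0 + \mathfrak{s}_+$.

These are direct sums of vector spaces, although they are not
direct sums of Lie algebras.

• Let $G_+, G_-, G_0, S_+, S_-, S_0$ be the connected subgroups of $G$
corresponding to the Lie subalgebras $\mathfrak{g}_+, \mathfrak{g}_-, \mathfrak{g}_0, \mathfrak{s}_+, \mathfrak{s}_-, \mathfrak{s}_0$,
respectively (see Exer. 1).

• Let $U = S_+$ (and let $\mathfrak{u}$ be the Lie algebra of $U$).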
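As a small sanity check on the normalizations in Notn. 5.3.3, the following sketch computes the weights of the three standard generators of $\mathfrak{sl}(2,\mathbb{R})$ under the right action $\underline{g} \mapsto [\underline{g}, \underline{a}]$. The explicit matrices `a_`, `u_`, `v_` and the helper `weight` are assumptions made for illustration (the usual model of $\mathfrak{sl}(2,\mathbb{R})$, with $\underline{u}$ lower triangular), not notation from the text.

```python
import numpy as np

def bracket(x, y):
    """Lie bracket [x, y] = xy - yx of two square matrices."""
    return x @ y - y @ x

# Assumed concrete generators of sl(2, R): a diagonal, u lower, v upper.
a_ = np.array([[1.0, 0.0], [0.0, -1.0]])
u_ = np.array([[0.0, 0.0], [1.0, 0.0]])
v_ = np.array([[0.0, 1.0], [0.0, 0.0]])

def weight(g, a_elem):
    """Return lam with [g, a] = lam * g, or None if g is not a weight vector.

    g is assumed to be nonzero.
    """
    b = bracket(g, a_elem)
    idx = np.argmax(np.abs(g))          # position of a nonzero entry of g
    lam = b.flat[idx] / g.flat[idx]
    return float(lam) if np.allclose(b, lam * g) else None

weights = {name: weight(x, a_) for name, x in [("u", u_), ("a", a_), ("v", v_)]}
print(weights)
print(np.allclose(bracket(v_, u_), a_))   # the relation [v, u] = a in this model
```

In this model $\underline{u}$ spans $\mathfrak{g}_2$, $\underline{a}$ spans $\mathfrak{g}_0$, and $\underline{v}$ spans $\mathfrak{g}_{-2}$, so $G_+$ is the lower unitriangular group and $G_-$ the upper unitriangular group.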

Because $S_- S_0 U = S_- S_0 S_+$ contains a neighborhood of $e$ in $S$ (see
Exer. 2), Prop. 5.2.4' states that the direction of fastest transverse
divergence belongs to $S_- S_0$. The following corollary is a priori weaker
(because $G_-$ and $G_0$ are presumably larger than $S_-$ and $S_0$), but it is
the only consequence of Lem. 5.8.6 or Lem. 5.2.1 that we will need in
our later arguments.

(5.3.4) Corollary. For any $\epsilon > 0$, there is a subset $\Omega_\epsilon$ of $\Gamma \backslash G$, such
that

1) $\mu(\Omega_\epsilon) > 1 - \epsilon$, and

2) for all $x, y \in \Omega_\epsilon$, with $x \approx y$, and such that a certain technical
assumption (5.8.9) is satisfied, the fastest transverse divergence of
the $U$-orbits through $x$ and $y$ is along some direction that belongs
to $G_- G_0$.

Exercises for §5.3. 

#1. Show $\mathfrak{g}_+$, $\mathfrak{g}_-$, and $\mathfrak{g}_0$ are subalgebras of $\mathfrak{g}$.

[Hint: $[\mathfrak{g}_{\lambda_1}, \mathfrak{g}_{\lambda_2}] \subset \mathfrak{g}_{\lambda_1 + \lambda_2}$.]

#2. Show $S_- S_0 S_+$ contains a neighborhood of $e$ in $S$.

[Hint: Because $\mathfrak{s}_- + \mathfrak{s}_0 + \mathfrak{s}_+ = \mathfrak{s}$, this follows from the Inverse
Function Theorem.]

5.4. Definition of the subgroup $\tilde{S}$

To exploit Cor. 5.3.4, let us introduce some notation. The corollary
states that orbits diverge fastest along some direction in $G_- G_0$, but it
will be important to understand when all of the transverse divergence,
not just the fastest part, is along $G_- G_0$. More precisely, we wish to
understand the elements $g$ of $G$, such that if $y = xg$, then the orbits
through $x$ and $y$ diverge transversely only along directions in $G_- G_0$:
the $G_+$-component of the relative motion should belong to $U$, so the
$G_+$-component of the divergence is trivial. Because the divergence is
measured by $u^{-1} g u$ (thought of as an element of $G/U$), this suggests
that we wish to understand

$\{\, g \in G \mid u^{-1} g u \in G_- G_0 U, \ \forall u \in \text{some neighborhood of } e \text{ in } U \,\}$.

This is the right idea, but replacing $G_- G_0 U$ with its Zariski closure
$\overline{G_- G_0 U}$ yields a slightly better theory. (For example, the resulting
subset of $G$ turns out to be a subgroup!) Fortunately, when $g$ is close
to $e$ (which is the case we are usually interested in), this alteration of
the definition makes no difference at all (see Exer. 10). (This is because
$G_- G_0 U$ contains a neighborhood of $e$ in $\overline{G_- G_0 U}$ (see Exer. 6).) Thus,
the non-expert may wish to think of $\overline{G_- G_0 U}$ as simply being $G_- G_0 U$,
although this is not strictly correct.

(5.4.1) Definition. Let

$\tilde{S} = \{\, g \in G \mid u^{-1} g u \in \overline{G_- G_0 U}, \text{ for all } u \in U \,\}$

and

$\tilde{S}_- = \tilde{S} \cap G_-$.

It is more or less obvious that $S \subset \tilde{S}$ (see Exer. 4). Although this is
much less obvious, it should also be noted that $\tilde{S}$ is a closed subgroup
of $G$ (see Exer. 8).

(5.4.2) Remark. Here is an alternate approach to the definition of $\tilde{S}$,
or, at least, its identity component.

1) Let

$\tilde{\mathfrak{s}} = \{\, \underline{g} \in \mathfrak{g} \mid \underline{g} (\operatorname{ad} \underline{u})^k \in \mathfrak{g}_- + \mathfrak{g}_0 + \mathfrak{u}, \ \forall k \ge 0, \ \forall \underline{u} \in \mathfrak{u} \,\}$.

Then $\tilde{\mathfrak{s}}$ is a Lie subalgebra of $\mathfrak{g}$ (see Exer. 11), so we may let $\tilde{S}^\circ$
be the corresponding connected Lie subgroup of $G$. (We will see
in (3) below that this agrees with Defn. 5.4.1.)

2) From the point of view in (1), it is not difficult to see that $\tilde{S}^\circ$ is
the unique maximal connected subgroup of $G$, such that

(a) $\tilde{S}^\circ \cap G_+ = U$, and

(b) $\tilde{S}^\circ$ is normalized by $a^t$

(see Exers. 12 and 13). This makes it obvious that $S \subset \tilde{S}^\circ$. It is
also easy to verify directly that $\mathfrak{s} \subset \tilde{\mathfrak{s}}$ (see Exer. 14).

3) It is not difficult to see that the identity component of the
subgroup defined in Defn. 5.4.1 is also the subgroup characterized
in (2) (see Exer. 15), so this alternate approach agrees with the
original definition of $\tilde{S}$.

(5.4.3) Example. Remark 5.4.2 makes it easy to calculate $\tilde{S}^\circ$.

1) We have $\tilde{S} = G$ if and only if $U = G_+$ (see Exer. 16).

2)-4) [The matrices defining these examples are missing from this copy; see Exers. 17-19.]

5) If $G = \mathrm{SL}(2,\mathbb{R}) \times \mathrm{SL}(2,\mathbb{R})$ [the remainder of this example is missing from this copy]
(see Exer. 20).

Exercises for §5.4. 

#1. Show that if

• $V$ is any subgroup of $G_+$ (or of $G_-$), and

• $V$ is normalized by $\{a^t\}$,
then $V$ is connected.

[Hint: If $v \in G_+$, then $a^{-t} v a^t \to e$ as $t \to -\infty$.]

#2. Show that if $H$ is a connected subgroup of $G$, and $H$ is normalized
by $\{a^t\}$, then $H \subset \overline{H_- H_0 H_+}$.

[Hint: $\dim \overline{H_- H_0 H_+} = \dim H$. Use Exer. 4.1#11.]

#3. Show, directly from Defn. 5.4.1, that $N_{G_-}(U) \subset \tilde{S}_-$.

#4. Show, directly from Defn. 5.4.1, that $S \subset \tilde{S}$.
[Hint: Use Exer. 2.]

#5. Let $G = \mathrm{SL}(2,\mathbb{R})$, [matrix missing from this copy] and $U = G_+ =$ [matrix missing from this copy].

(a) Show that $G_- G_0 G_+ \neq G$.

(b) For $g \in G$, show that if $u^{-1} g u \in G_- G_0 U$, for all $u \in U$, then
$g \in G_0 U$.

(c) Show, for all $g \in G$, and all $u \in U$, that $u^{-1} g u \in \overline{G_- G_0 U}$.
Therefore $\tilde{S} = G$.

[Hint: Letting $v = (0,1)$, and considering the usual representation
of $G$ on $\mathbb{R}^2$, we have $U = \operatorname{Stab}_G(v)$. Thus, $G/U$ may be identified with
$\mathbb{R}^2 \smallsetminus \{0\}$. This identifies $G_- G_0 U/U$ with $\{(x,y) \in \mathbb{R}^2 \mid x > 0\}$.]

#6. Assume $G$ is almost Zariski closed (see Rem. 5.0.5). Define the
polynomial $\psi \colon G_- G_0 \times G_+ \to G_- G_0 G_+$ by $\psi(g,u) = gu$. (Note
that $G_- G_0 G_+$ is an open subset of $G$ (cf. Exer. 5.3#2).) Assume
the inverse of $\psi$ is rational (although we do not prove it, this is
indeed always the case, cf. Exer. 5).
Show that $G_- G_0 U$ is an open subset of $\overline{G_- G_0 U}$.
[Hint: $G_- G_0 U$ is the inverse image of $U$ under a rational map
$\psi_+^{-1} \colon G_- G_0 G_+ \to G_+$.]

#7. (a) Show that if

• $V$ is a Zariski closed subset of $\mathrm{SL}(\ell,\mathbb{R})$,

• $g \in \mathrm{SL}(\ell,\mathbb{R})$, and

• $Vg \subset V$,

then $Vg = V$.

(b) Show that if $V$ is a Zariski closed subset of $\mathrm{SL}(\ell,\mathbb{R})$, then

$\{\, g \in \mathrm{SL}(\ell,\mathbb{R}) \mid Vg \subset V \,\}$

is a closed subgroup of $\mathrm{SL}(\ell,\mathbb{R})$.

(c) Construct an example to show that the conclusion of (7b)
can fail if $V$ is assumed only to be closed, not Zariski closed.

[Hint: Use Exer. 4.1#11.]

#8. Show, directly from Defn. 5.4.1, that $\tilde{S}$ is a subgroup of $G$.
[Hint: Show that

$\tilde{S} = \{\, g \in G \mid \overline{G_- G_0 U}\, g \subset \overline{G_- G_0 U} \,\}$,

and apply Exer. 7b.]

#9. Show that if $g \notin \tilde{S}$, then

$\{\, u \in U \mid u^{-1} g u \in \overline{G_- G_0 U} \,\}$

is nowhere dense in $U$. That is, its closure does not contain any
open subset of $U$.

[Hint: It is a Zariski closed, proper subset of $U$.]

#10. Show that there is a neighborhood $W$ of $e$ in $G$, such that

$\tilde{S} \cap W = \{\, g \in W \mid u^{-1} g u \in G_- G_0 U, \ \forall u \in \text{some neighborhood of } e \text{ in } U \,\}$.

[Hint: Use Exers. 9 and 6.]

#11. Show, directly from the definition (see 5.4.1), that

(a) $\tilde{\mathfrak{s}}$ is invariant under $\operatorname{ad} \underline{a}$, and

(b) $\tilde{\mathfrak{s}}$ is a Lie subalgebra of $\mathfrak{g}$.

[Hint: If $\underline{g}_1 \in \tilde{\mathfrak{s}}_{\lambda_1}$, $\underline{g}_2 \in \tilde{\mathfrak{s}}_{\lambda_2}$, $\underline{u} \in \mathfrak{u}_{\lambda_3}$, and $\lambda_1 + \lambda_2 + (k_1 + k_2)\lambda_3 > 0$,
then $\underline{g}_i (\operatorname{ad} \underline{u})^{k_i} \in \mathfrak{u}$, for some $i \in \{1, 2\}$, so

$[\underline{g}_1 (\operatorname{ad} \underline{u})^{k_1}, \underline{g}_2 (\operatorname{ad} \underline{u})^{k_2}] \in \mathfrak{g}_- + \mathfrak{g}_0 + \mathfrak{u}$,

and it follows that $\tilde{\mathfrak{s}}$ is a Lie subalgebra.]

#12. Show, directly from Defn. 5.4.1, that

(a) $\tilde{S} \cap G_+ = U$, and

(b) $\tilde{S}$ is normalized by $\{a^t\}$.

[Hint: It suffices to show that $\tilde{\mathfrak{s}}_+ = \mathfrak{u}$ (see Exer. 1), and that $\tilde{\mathfrak{s}}$ is
$(\operatorname{Ad}_G a^t)$-invariant.]

#13. Show, directly from Defn. 5.4.1, that if $H$ is any connected
subgroup of $G$, such that

(a) $H \cap G_+ = U$, and

(b) $H$ is normalized by $\{a^t\}$,
then $H \subset \tilde{S}$.

[Hint: It suffices to show that $\mathfrak{h} \subset \tilde{\mathfrak{s}}$.]

#14. Show, directly from the definition of $\tilde{\mathfrak{s}}$ in Rem. 5.4.2(1), that
$\mathfrak{s} \subset \tilde{\mathfrak{s}}$.

#15. Verify, directly from Defn. 5.4.1 (and assuming that $\tilde{S}$ is a
subgroup),

(a) that $\tilde{S}$ satisfies conditions (a) and (b) of Rem. 5.4.2(2), and

(b) conversely, that if $H$ is a connected subgroup of $G$, such that
$H \cap G_+ = U$ and $H$ is normalized by $\{a^t\}$, then $H \subset \tilde{S}$.

#16. Verify Eg. 5.4.3(1). 
#17. Verify Eg. 5.4.3(2). 
#18. Verify Eg. 5.4.3(3). 
#19. Verify Eg. 5.4.3(4). 
#20. Verify Eg. 5.4.3(5). 

5.5. Two important consequences of shearing 

Our ultimate goal is to find a conull subset $\Omega$ of $\Gamma \backslash G$, such that
if $x, y \in \Omega$, then $y \in xS$. In this section, we establish two
consequences of Cor. 5.3.4 that represent major progress toward this goal (see
Cors. 5.5.2 and 5.5.4). These results deal with $\tilde{S}$, rather than $S$, but that
turns out not to be a very serious problem, because $\tilde{S} \cap G_+ = S \cap G_+$
(see Rem. 5.4.2(2)) and $\tilde{S} \cap G_- = S \cap G_-$ (see Prop. 5.6.1).

(5.5.1) Notation. Let

• $\mathfrak{g}_+ \ominus \mathfrak{u}$ be an $a^s$-invariant complement to $\mathfrak{u}$ in $\mathfrak{g}_+$,

• $\mathfrak{g}_- \ominus \tilde{\mathfrak{s}}_-$ be an $a^s$-invariant complement to $\tilde{\mathfrak{s}}_-$ in $\mathfrak{g}_-$,

• $G_+ \ominus U = \exp(\mathfrak{g}_+ \ominus \mathfrak{u})$,

and

• $G_- \ominus \tilde{S}_- = \exp(\mathfrak{g}_- \ominus \tilde{\mathfrak{s}}_-)$.

Note that the natural maps $(G_+ \ominus U) \times U \to G_+$ and $(G_- \ominus \tilde{S}_-) \times \tilde{S}_- \to G_-$
(defined by $(g,h) \mapsto gh$) are diffeomorphisms (see Exer. 1).

(5.5.2) Corollary. There is a conull subset $\Omega$ of $\Gamma \backslash G$, such that if
$x, y \in \Omega$, and $y \in x G_-$, then $y \in x \tilde{S}_-$.

Proof. Choose $\Omega_0$ as in the conclusion of Cor. 5.3.4. From the Pointwise
Ergodic Theorem (3.1.3), we know that

$\Omega = \{\, x \in \Gamma \backslash G \mid \{\, t \in \mathbb{R}^+ \mid x a^t \in \Omega_0 \,\} \text{ is unbounded} \,\}$

is conull (see Exer. 3.1#3).

We have $y = xg$, for some $g \in G_-$. Because $a^{-t} g a^t \to e$ as $t \to \infty$,
we may assume, by replacing $x$ and $y$ with $x a^t$ and $y a^t$ for some
infinitely large $t$, that $g$ is infinitesimal (and that $x, y \in \Omega_0$). (See
Exer. 5.8#6 for a non-infinitesimal version of the proof.)

Suppose $g \notin \tilde{S}_-$ (this will lead to a contradiction). From the
definition of $\tilde{S}_-$, this means there is some $u \in U$, such that $u^{-1} g u \notin G_- G_0 U$:
write $u^{-1} g u = h c u'$ with $h \in G_- G_0$, $c \in G_+ \ominus U$, and $u' \in U$. We may
assume $h$ is infinitesimal (because we could choose $u$ to be finite, or
even infinitesimal, if desired (see Exer. 5.4#9)). Translating again by
an (infinitely large) element of $\{a^t\}$, with $t > 0$, we may assume $c$ is
infinitely large. Because $h$ is infinitesimal, this clearly implies that the
orbits through $x$ and $y$ diverge fastest along a direction in $G_+$, not a
direction in $G_- G_0$. This contradicts Cor. 5.3.4. (See Exer. 5.8#1 for a
verification of the technical assumption (5.8.9) in that corollary.) □

An easy calculation (involving only algebra, not dynamics) estab- 
lishes the following. (See Exer. 5.8#7 for a non-infinitesimal version.) 

(5.5.3) Lemma. If

• $y = xg$ with

• $g \in (G_- \ominus \tilde{S}_-) G_0 G_+$, and

• $g \approx e$,

then the transverse divergence of the $U$-orbits through $x$ and $y$ is fastest
along some direction in $G_+$.

Proof. Choose $s > 0$ (infinitely large), such that $\hat{g} = a^s g a^{-s}$ is finite,
but not infinitesimal, and write $\hat{g} = \hat{g}_- \hat{g}_0 \hat{g}_+$, with $\hat{g}_- \in G_-$, $\hat{g}_0 \in G_0$,
and $\hat{g}_+ \in G_+$. (Note that $\hat{g}_0$ and $\hat{g}_+$ are infinitesimal, but $\hat{g}_-$ is not.)
Because $\hat{g}_- \in G_- \ominus \tilde{S}_-$, we know that $\hat{g}$ is not infinitely close to $\tilde{S}_-$, so
there is some finite $\hat{u} \in U$, such that $\hat{u}^{-1} \hat{g} \hat{u}$ is not infinitesimally close
to $G_- G_0 U$.

Let $u = a^{-s} \hat{u} a^s$, and consider $u^{-1} g u = a^{-s} (\hat{u}^{-1} \hat{g} \hat{u}) a^s$.

• Because $\hat{u}^{-1} \hat{g} \hat{u}$ is finite (since $\hat{u}$ and $\hat{g}$ are finite), we know that
each of $(\hat{u}^{-1} \hat{g} \hat{u})_-$ and $(\hat{u}^{-1} \hat{g} \hat{u})_0$ is finite. Therefore $(u^{-1} g u)_-$ and
$(u^{-1} g u)_0$ are finite, because conjugation by $a^s$ does not expand
$G_-$ or $G_0$.

• On the other hand, we know that $(u^{-1} g u)_+$ is infinitely far
from $U$, because the distance between $\hat{u}^{-1} \hat{g} \hat{u}$ and $U$ is not
infinitesimal, and conjugation by $a^s$ expands $G_+$ by an infinite
factor.

Therefore, the fastest divergence is clearly along a direction in $G_+$. □
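The mechanism in the proof above (conjugation by $a^s$ shrinks the $G_-$-component, fixes the $G_0$-component, and stretches the $G_+$-component) can be seen concretely in $\mathrm{SL}(2,\mathbb{R})$. The sketch below is only an illustration under assumed conventions: $a^t = \operatorname{diag}(e^t, e^{-t})$, $G_-$ upper unitriangular, $G_0$ positive diagonal, $G_+$ lower unitriangular. The helper names `decompose` and `recompose` are hypothetical.

```python
import numpy as np

def a(t):
    """Assumed model of the hyperbolic one-parameter subgroup."""
    return np.array([[np.exp(t), 0.0], [0.0, np.exp(-t)]])

def recompose(x, d, y):
    """Return g_minus @ g_zero @ g_plus for the given coordinates."""
    g_minus = np.array([[1.0, x], [0.0, 1.0]])     # element of G_-
    g_zero = np.array([[d, 0.0], [0.0, 1.0 / d]])  # element of G_0
    g_plus = np.array([[1.0, 0.0], [y, 1.0]])      # element of G_+
    return g_minus @ g_zero @ g_plus

def decompose(g):
    """Coordinates (x, d, y) with g = recompose(x, d, y).

    Only defined when the lower-right entry of g is positive.
    """
    s = g[1, 1]
    if s <= 0:
        raise ValueError("g is not in G_- G_0 G_+")
    return g[0, 1] / s, 1.0 / s, g[1, 0] / s

g = recompose(0.3, 1.1, 1e-6)   # tiny G_+ component, moderate G_- component
t = 5.0
x1, d1, y1 = decompose(a(-t) @ g @ a(t))
print(x1, d1, y1)               # G_- part shrinks, G_0 part fixed, G_+ part grows
```

Here the $G_-$-coordinate is multiplied by $e^{-2t}$ and the $G_+$-coordinate by $e^{2t}$, so even a very small $G_+$-component eventually dominates, which is the scaling that Lem. 5.5.3 exploits.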

The conclusion of the above lemma contradicts the conclusion of 
Cor. 5.3.4(2) (and the technical assumption (5.8.9) is automatically 
satisfied in this situation (see Exer. 5.8#2)), so we have the following 
conclusion: 

(5.5.4) Corollary. For any $\epsilon > 0$, there is a subset $\Omega_\epsilon$ of $\Gamma \backslash G$, such
that

1) $\mu(\Omega_\epsilon) > 1 - \epsilon$, and

2) for all $x, y \in \Omega_\epsilon$, with $x \approx y$, we have $y \notin x (G_- \ominus \tilde{S}_-) G_0 G_+$.

This can be restated in the following non-infinitesimal terms (see
Exer. 5.8#8):

(5.5.4') Corollary. For any $\epsilon > 0$, there is a subset $\Omega_\epsilon$ of $\Gamma \backslash G$, and
some $\delta > 0$, such that

1) $\mu(\Omega_\epsilon) > 1 - \epsilon$, and

2) for all $x, y \in \Omega_\epsilon$, with $d(x, y) < \delta$, we have $y \notin x (G_- \ominus \tilde{S}_-) G_0 G_+$.

Exercise for §5.5.

#1. Show that if $\mathfrak{v}$ and $\mathfrak{w}$ are two complementary $a^t$-invariant
subspaces of $\mathfrak{g}_+$, then the natural map $\exp \mathfrak{v} \times \exp \mathfrak{w} \to G_+$, defined
by $(v,w) \mapsto vw$, is a diffeomorphism.

[Hint: The Inverse Function Theorem implies that the map is a local
diffeomorphism near $e$. Conjugate by $a^s$ to expand the good
neighborhood.]

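5.6. Comparing $\tilde{S}_-$ with $S_-$

We will now show that $\tilde{S}_- = S_-$ (see Prop. 5.6.1). To do this, we
use the following lemma on the entropy of translations on homogeneous
spaces. Corollary 5.5.2 is what makes this possible, by verifying the
hypotheses of Lem. 2.5.11'(2), with $W = \tilde{S}_-$.

(2.5.11') Lemma. Suppose $W$ is a closed, connected subgroup of $G_-$
that is normalized by $a$, and let

$J(a^{-1}, W) = \det\bigl( (\operatorname{Ad} a^{-1})|_{\mathfrak{w}} \bigr)$

be the Jacobian of $a^{-1}$ on $W$.

1) If $\mu$ is $W$-invariant, then $h_\mu(a) \ge \log J(a^{-1}, W)$.

2) If there is a conull, Borel subset $\Omega$ of $\Gamma \backslash G$, such that $\Omega \cap x G_- \subset xW$,
for every $x \in \Omega$, then $h_\mu(a) \le \log J(a^{-1}, W)$.

3) If the hypotheses of (2) are satisfied, and equality holds in its
conclusion, then $\mu$ is $W$-invariant.

(5.6.1) Proposition. We have $\tilde{S}_- = S_-$.

Proof (cf. proofs of Cors. 1.7.6 and 1.8.1). We already know that
$\tilde{S}_- \supset S_-$ (see Rem. 5.4.2(2)). Thus, because $\tilde{S}_- \subset G_-$, it suffices to
show that $\tilde{S}_- \subset S$. That is, it suffices to show that $\mu$ is $\tilde{S}_-$-invariant.

From Lem. 2.5.11'(1), with $a^{-1}$ in the role of $a$, and $U$ in the role
of $W$, we have

$h_\mu(a^{-1}) \ge \log J(a, U)$.

From Cor. 5.5.2 and Lem. 2.5.11'(2), we have

$h_\mu(a) \le \log J(a^{-1}, \tilde{S}_-)$.

Combining these two inequalities with the fact that $h_\mu(a) = h_\mu(a^{-1})$
(see Exer. 2.3#7), we have

$\log J(a, U) \le h_\mu(a^{-1}) = h_\mu(a) \le \log J(a^{-1}, \tilde{S}_-)$.

Thus, if we show that

$\log J(a^{-1}, \tilde{S}_-) \le \log J(a, U)$, (5.6.2)

then we must have equality throughout, and the desired conclusion will
follow from Lem. 2.5.11'(3).

Because $\underline{u}$ belongs to the Lie algebra $\mathfrak{l}$ of $L$ (see Notn. 5.3.3), the
structure of $\mathfrak{sl}(2,\mathbb{R})$-modules implies, for each $\lambda \in \mathbb{Z}^+$, that the
restriction $(\operatorname{ad}_{\mathfrak{g}} \underline{u})^\lambda|_{\mathfrak{g}_{-\lambda}}$ is a bijection from the weight space $\mathfrak{g}_{-\lambda}$ onto
the weight space $\mathfrak{g}_\lambda$ (see Exer. 1). If $\underline{g} \in \tilde{\mathfrak{s}}_- \cap \mathfrak{g}_{-\lambda}$, then Rem. 5.4.2(1)
implies $\underline{g} (\operatorname{ad}_{\mathfrak{g}} \underline{u})^\lambda \in (\mathfrak{g}_- + \mathfrak{g}_0 + \mathfrak{u}) \cap \mathfrak{g}_\lambda = \mathfrak{u} \cap \mathfrak{g}_\lambda$, so we conclude that
$(\operatorname{ad}_{\mathfrak{g}} \underline{u})^\lambda|_{\tilde{\mathfrak{s}}_- \cap \mathfrak{g}_{-\lambda}}$ is an embedding of $\tilde{\mathfrak{s}}_- \cap \mathfrak{g}_{-\lambda}$ into $\mathfrak{u} \cap \mathfrak{g}_\lambda$. So

$\dim(\tilde{\mathfrak{s}}_- \cap \mathfrak{g}_{-\lambda}) \le \dim(\mathfrak{u} \cap \mathfrak{g}_\lambda)$.

The eigenvalue of $\operatorname{Ad}_G a = \exp(\operatorname{ad}_{\mathfrak{g}} \underline{a})$ on $\mathfrak{g}_\lambda$ is $e^\lambda$, and the eigenvalue
of $\operatorname{Ad}_G a^{-1}$ on $\mathfrak{g}_{-\lambda}$ is also $e^\lambda$ (see Exer. 2). Hence,

$\log J(a^{-1}, \tilde{S}_-) = \log \det (\operatorname{Ad}_G a^{-1})|_{\tilde{\mathfrak{s}}_-}$

$= \log \prod_{\lambda \in \mathbb{Z}^+} (e^\lambda)^{\dim(\tilde{\mathfrak{s}}_- \cap \mathfrak{g}_{-\lambda})}$

$= \sum_{\lambda \in \mathbb{Z}^+} \bigl( \dim(\tilde{\mathfrak{s}}_- \cap \mathfrak{g}_{-\lambda}) \bigr) \cdot \log e^\lambda$

$\le \sum_{\lambda \in \mathbb{Z}^+} \bigl( \dim(\mathfrak{u} \cap \mathfrak{g}_\lambda) \bigr) \cdot \log e^\lambda$

$= \log J(a, U)$,

as desired. □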
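To make the bookkeeping in the last display concrete, here is a small numerical sketch of the formula $\log J = \sum_\lambda \lambda \cdot \dim(\text{weight space})$. It uses an assumed example that does not appear in the text: $G = \mathrm{SL}(3,\mathbb{R})$ and $a = \exp(\operatorname{diag}(1, 0, -1))$, with $\operatorname{Ad} a$ acting on the right as $\underline{g} \mapsto a^{-1} \underline{g} a$, so that $\mathfrak{g}_+$ is strictly lower triangular. The function `log_jacobian` is a hypothetical helper.

```python
import numpy as np
from itertools import product

def log_jacobian(A, basis):
    """log |det| of the map g -> A^{-1} g A restricted to span(basis).

    The span is assumed to be invariant under this map.
    """
    B = np.array([b.ravel() for b in basis]).T                      # columns: basis vectors
    images = np.array([(np.linalg.inv(A) @ b @ A).ravel() for b in basis]).T
    M, *_ = np.linalg.lstsq(B, images, rcond=None)                  # matrix of the restriction
    return float(np.log(abs(np.linalg.det(M))))

def elementary(i, j, n=3):
    """Elementary matrix with a single 1 in position (i, j)."""
    e = np.zeros((n, n))
    e[i, j] = 1.0
    return e

diag = np.array([1.0, 0.0, -1.0])          # assumed generator of a
A = np.diag(np.exp(diag))
pairs = list(product(range(3), repeat=2))
g_plus = [elementary(i, j) for i, j in pairs if i > j]    # positive weights
g_minus = [elementary(i, j) for i, j in pairs if i < j]   # negative weights

log_J_plus = log_jacobian(A, g_plus)                      # log J(a, G_+)
log_J_minus = log_jacobian(np.linalg.inv(A), g_minus)     # log J(a^{-1}, G_-)
weight_sum = sum(diag[j] - diag[i] for i, j in pairs if i > j)
print(log_J_plus, log_J_minus, weight_sum)
```

In this example all three quantities agree, which also illustrates the identity $J(a^{-1}, G_-) = J(a, G_+)$ of Exer. 5.7#4(a) in one special case.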

Exercises for §5.6. 

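#1. Suppose

• $\mathcal{W}$ is a finite-dimensional real vector space, and

• $\rho \colon \mathfrak{sl}(2,\mathbb{R}) \to \mathfrak{sl}(\mathcal{W})$ is a Lie algebra homomorphism.
Show, for every $m \in \mathbb{Z}^{\ge 0}$, that $(\underline{u}^\rho)^m$ is a bijection from $\mathcal{W}_{-m}$
to $\mathcal{W}_m$.

[Hint: Use Prop. 4.9.22. If $-\lambda_i \le 2j - \lambda_i \le 0$, then $w_{i,j} (\underline{u}^\rho)^{\lambda_i - 2j}$ is a
nonzero multiple of $w_{i,\lambda_i - j}$, and $w_{i,\lambda_i - j} (\underline{v}^\rho)^{\lambda_i - 2j}$ is a nonzero multiple
of $w_{i,j}$.]

#2. In the notation of the proof of Prop. 5.6.1, show that the
eigenvalue of $\operatorname{Ad}_G a^{-1}$ on $\mathfrak{g}_{-\lambda}$ is the same as the eigenvalue of
$\operatorname{Ad}_G a$ on $\mathfrak{g}_\lambda$.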

5.7. Completion of the proof 

We wish to show, for some $x \in \Gamma \backslash G$, that $\mu(xS) > 0$. In other
words, that $\mu(x S_- S_0 S_+) > 0$. The following weaker result is a crucial
step in this direction.

(5.7.1) Proposition. For some $x \in \Gamma \backslash G$, we have $\mu(x S_- G_0 G_+) > 0$.

Proof. Assume that the desired conclusion fails. (This will lead to a
contradiction.) Let $\Omega_\epsilon$ be as in Cor. 5.5.4, with $\epsilon$ sufficiently small.

Because the conclusion of the proposition is assumed to fail, there
exist $x, y \in \Omega_\epsilon$, with $x \approx y$ and $y = xg$, such that $g \notin S_- G_0 G_+$. (See
Exer. 2 for a non-infinitesimal proof.) Thus, we may write

$g = vwh$ with $v \in S_-$, $w \in (G_- \ominus S_-) \smallsetminus \{e\}$, and $h \in G_0 G_+$.

For simplicity, let us pretend that $\Omega_\epsilon$ is $S_-$-invariant. (This is not so
far from the truth, because $\mu$ is $S_-$-invariant and $\mu(\Omega_\epsilon)$ is very close
to 1, so the actual proof is only a little more complicated (see Exer. 1).)
Then we may replace $x$ with $xv$, so that $g = wh \in (G_- \ominus S_-) G_0 G_+$.
This contradicts the definition of $\Omega_\epsilon$. □

We can now complete the proof (using some of the theory of alge- 
braic groups). 

(5.7.2) Theorem. $\mu$ is supported on a single $S$-orbit.

Proof. There is no harm in assuming that $G$ is almost Zariski closed
(see Rem. 5.0.5). By induction on $\dim G$, we may assume that there
does not exist a subgroup $H$ of $G$, such that

• $H$ is almost Zariski closed,

• $U \subset H$, and

• some $H$-orbit has full measure.

Then a short argument (see Exer. 4.7#8) implies, for all $x \in \Gamma \backslash G$, that

if $V$ is any subset of $G$,
such that $\mu(xV) > 0$, then $G \subset \overline{V}$. (5.7.3)

This hypothesis will allow us to show that $S = G$.

Claim. We have $S_- = G_-$. Prop. 5.7.1 states that $\mu(x S_- G_0 G_+) > 0$,
so, from (5.7.3), we know that $G \subset \overline{S_- G_0 G_+}$. This implies that
$S_- G_0 G_+$ must contain an open subset of $G$ (see Exer. 5.4#6).
Therefore

$\dim S_- \ge \dim G - \dim(G_0 G_+) = \dim G_-$.

Because $S_- \subset G_-$, and $G_-$ is connected, this implies that $S_- = G_-$,
as desired.

The subgroup $G_-$ is a horospherical subgroup of $G$ (see Rem. 2.5.6),
so we have shown that $\mu$ is invariant under a horospherical subgroup
of $G$.

There are now at least three ways to complete the argument.

a) We showed that $\mu$ is $G_-$-invariant. By going through the same
argument, but with $v^r$ in the place of $u^t$, we could show that $\mu$ is
$G_+$-invariant. So $S$ contains $\langle L, G_+, G_- \rangle$, which is easily seen to
be a (unimodular) normal subgroup of $G$ (see Exer. 3). Then
Exer. 5.2#3 applies.

b) By using considerations of entropy, much as in the proof of
Prop. 5.6.1, one can show that $G_+ \subset S$ (see Exer. 4), and then
Exer. 5.2#3 applies, once again.

c) If we assume that $\Gamma \backslash G$ is compact (and in some other cases),
then a completely separate proof of the theorem is known for
measures that are invariant under a horospherical subgroup. (An
example of an argument of this type appears in Exers. 3.2#8
and 5.) Such special cases were known several years before the
general theorem. □

Exercises for §5.7. 

#1. Prove Prop. 5.7.1 (without assuming $\Omega_\epsilon$ is $S_-$-invariant).

[Hint: Because $\Omega_\epsilon$ contains 99% of the $S_-$-orbits of both $x$ and $y$,
it is possible to find $x' \in x S_- \cap \Omega_\epsilon$ and $y' \in y S_- \cap \Omega_\epsilon$, such that
$y' \in x' (G_- \ominus S_-) G_0 G_+$.]

#2. Prove Prop. 5.7.1 without using infinitesimals.

[Hint: Use Cor. 5.5.4'.]

#3. Show that $\langle G_-, G_+ \rangle$ is a normal subgroup of $G^\circ$.

[Hint: It suffices to show that it is normalized by $G_-$, $G_0$, and $G_+$.]

#4. (a) Show that $J(a^{-1}, G_-) = J(a, G_+)$.

(b) Use Lem. 2.5.11' (at the beginning of §5.6) to show that if $\mu$
is $G_-$-invariant, then it is $G_+$-invariant.

#5. Let

• $G$ be a connected, semisimple subgroup of $\mathrm{SL}(\ell,\mathbb{R})$,

• $\Gamma$ be a lattice in $G$, such that $\Gamma \backslash G$ is compact,

• $\mu$ be a probability measure on $\Gamma \backslash G$,

• $a^s$ be a nontrivial hyperbolic one-parameter subgroup of $G$,
and

• $G_+$ be the corresponding expanding horospherical subgroup
of $G$.

Show that if $\mu$ is $G_+$-invariant, then $\mu$ is the Haar measure on
$\Gamma \backslash G$.

[Hint: Cf. hint to Exer. 3.2#8. (Let $U_\epsilon \subset G_+$, $A_\epsilon \subset G_0$, and $V_\epsilon \subset G_-$.)
Because $\mu$ is not assumed to be $a^s$-invariant, it may not be possible
to choose a generic point $y$ for $\mu$, such that $y a^{s_k} \to y$. Instead, show
that the mixing property (3.2.8) can be strengthened to apply to the
compact family of subsets $\{\, y U_\epsilon A_\epsilon V_\epsilon \mid y \in \Gamma \backslash G \,\}$.]

5.8. Some precise statements 

Let us now state these results more precisely, beginning with the 
statement that polynomials stay near their largest value for a propor- 
tional length of time. 

(5.8.1) Lemma. For any $d$ and $\epsilon$, and any averaging sequence $\{E_n\}$ of
open sets in any unipotent subgroup $U$ of $G$, there is a ball $B$ around $e$
in $U$, such that if

• $f \colon U \to \mathbb{R}^m$ is any polynomial of degree $\le d$,

• $E_n$ is an averaging set in the averaging sequence $\{E_n\}$, and

• $\sup_{u \in E_n} \|f(u)\| \le 1$,

then $\|f(v_1 u v_2) - f(u)\| < \epsilon$, for all $u \in E_n$, and all $v_1, v_2 \in B_n = a^{-n} B a^n$.

(5.8.2) Remark. Note that $\nu_U(B_n)/\nu_U(E_n) = \nu_U(B)/\nu_U(E)$ is
independent of $n$; thus, $B_n$ represents an amount of time proportional
to $E_n$.

Proof. The set $\mathcal{P}^d$ of real polynomials of degree $\le d$ is a
finite-dimensional vector space, so

$\{\, f \in \mathcal{P}^d \mid \sup_{u \in E_0} \|f(u)\| \le 1 \,\}$

is compact. Thus, there is a ball $B$ around $e$ in $U$, such that the
conclusion of the lemma holds for $n = 0$. Rescaling by $a^n$ then implies that
it must also hold for any $n$. □

As was noted in the previous chapter, if $y = xg$, then the relative
displacement between $xu$ and $yu$ is $u^{-1} g u$. For each fixed $g$, this is
a polynomial function on $U$, and the degree is bounded independent
of $g$. The following observation makes a similar statement about the
transverse divergence of two $U$-orbits. It is a formalization of Lem. 5.2.2.

(5.8.3) Remark. Given $y = xg$, the relative displacement between
$xu$ and $yu$ is $u^{-1} g u$. To measure the part of this displacement that is
transverse to the $U$-orbit, we wish to multiply by an element $u'$ of $U$, to
make $(u^{-1} g u) u'$ as small as possible: equivalently, we can simply think
of $u^{-1} g u$ in the quotient space $G/U$. That is,

the transverse distance between the two $U$-orbits (at the
point $xu$) is measured by the position of the point $(u^{-1} g u) U = u^{-1} g U$
in the homogeneous space $G/U$.

Because $U$ is Zariski closed (see Prop. 4.6.2), we know, from Chevalley's
Theorem (4.5.2), that, for some $m$, there is

• a polynomial homomorphism $\rho \colon G \to \mathrm{SL}(m,\mathbb{R})$, and

• a vector $w \in \mathbb{R}^m$,

such that (writing our linear transformations on the left) we have

$U = \{\, u \in G \mid \rho(u) w = w \,\}$.

Thus, we may identify $G/U$ with the orbit $wG$, and, because $\rho$ is a
polynomial, we know that $u \mapsto \rho(u^{-1} g u) w$ is a polynomial function on $U$.
Hence, the transverse distance between the two $U$-orbits is completely
described by a polynomial function.

We now make precise the statement in Lem. 5.2.1 that the
direction of fastest divergence is in the direction of the normalizer. (See
Rem. 5.8.5 for a non-infinitesimal version of the result.)

(5.8.4) Proposition. Suppose

• $U$ is a connected, unipotent subgroup of $G$,

• $x, y \in \Gamma \backslash G$,

• $y = xg$, for some $g \in G$, with $g \approx e$, and

• $E$ is an (infinitely large) averaging set,
such that

• $g^E = \{\, u^{-1} g u \mid u \in E \,\}$ has finite diameter in $G/U$.

Then each element of $g^E U/U$ is infinitesimally close to some element
of $N_G(U)/U$.

Proof. Let $g' \in g^E$. Note that

$N_G(U)/U = \{\, x \in G/U \mid ux = x, \text{ for all } u \in U \,\}$.

Thus, it suffices to show that $u g' U \approx g' U$, for each finite $u \in U$.

We may assume $g' U$ is a finite (not infinitesimal) distance from the
base point $eU$, so its distance is comparable to the farthest distance
in $g^E U/U$. It took infinitely long to achieve this distance, so polynomial
divergence implies that it takes a proportional, hence infinite, amount
of time to move any additional finite distance. Thus, in any finite time,
the point $g' U$ moves only infinitesimally. Therefore, $u g' U \approx g' U$, as
desired. □
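The "polynomial divergence" step in the proof above (and Lem. 5.8.1 behind it) says that a polynomial of bounded degree that stays bounded over a long time interval can only change slowly, on a time scale proportional to the length of that interval. The sketch below illustrates this in the simplest assumed setting, a single real variable, where it follows from Markov's inequality $\sup_{[-T,T]} |f'| \le (d^2/T) \sup_{[-T,T]} |f|$ for polynomials of degree at most $d$. This is only an illustration, not the compactness argument used in the text; the function `max_variation` and the parameter choices are hypothetical.

```python
import numpy as np
from numpy.polynomial import chebyshev as C

def max_variation(coeffs, T, step, grid=20001):
    """Max of |f(t + step) - f(t)| over a grid of t with t, t + step in [-T, T].

    f(t) = sum_k coeffs[k] * T_k(t / T), written in the Chebyshev basis.
    """
    t = np.linspace(-T, T - step, grid)
    f = lambda x: C.chebval(x / T, coeffs)
    return float(np.max(np.abs(f(t + step) - f(t))))

d, eps = 5, 0.1
coeffs = np.zeros(d + 1)
coeffs[d] = 1.0                      # Chebyshev polynomial T_d: sup-norm 1 on [-T, T]

results = {}
for T in (1.0, 10.0, 1000.0):
    step = eps * T / d**2            # window length proportional to T
    results[T] = max_variation(coeffs, T, step)
print(results)
```

The Chebyshev polynomial is used because it is the extremal case for Markov's inequality; for every $T$ the variation over a window of length $\epsilon T / d^2$ stays below $\epsilon$, so the "slow" window grows in proportion to the averaging interval.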

(5.8.5) Remark. The above statement and proof are written in terms
of infinitesimals. To obtain a non-infinitesimal version, replace

• $x$ and $y$ with convergent sequences $\{x_k\}$ and $\{y_k\}$, such that
$d(x_k, y_k) \to 0$,

• $g$ with the sequence $\{g_k\}$, defined by $x_k g_k = y_k$, and

• $E$ with an averaging sequence $E_n$, such that $g_k^{E_{n_k}}$ is bounded in
$G/U$ (independent of $k$).

The conclusion is that if $\{g'_k\}$ is any sequence, such that

• $g'_k \in g_k^{E_{n_k}}$ for each $k$, and

• $g'_k U/U$ converges,

then the limit is an element of $N_G(U)/U$ (see Exer. 3).

(5.8.6) Lemma. If

• $C$ is any compact subset of $N_G(U) \smallsetminus \operatorname{Stab}_G(\mu)$, and

• $\epsilon > 0$,

then there is a compact subset $K$ of $\Gamma \backslash G$, such that

1) $\mu(K) > 1 - \epsilon$ and

2) $K \cap Kc = \emptyset$, for all $c \in C$.

Proof. Let $\Omega$ be the set of all points in $\Gamma \backslash G$ that are generic for $\mu$ (see
Defn. 3.1.5 and Thm. 3.4.3). It suffices to show that $\Omega \cap \Omega c = \emptyset$, for all
$c \in N_G(U) \smallsetminus \operatorname{Stab}_G(\mu)$, for then we may choose $K$ to be any compact
subset of $\Omega$ with $\mu(K) > 1 - \epsilon$.

Fix $c \in C$. We choose a compact subset $K_c$ of $\Omega$ with $K_c \cap K_c c = \emptyset$,
and $\mu(K_c) > 1 - \delta$, where $\delta$ depends only on $c$, but will be specified
later.

Now suppose $x, xc \in \Omega$. Except for a proportion $\delta$ of the time, we
have $xu$ very near to $K_c$ (because $x \in \Omega$). Thus, it suffices to have
$xuc$ very close to $K_c$ more than a proportion $\delta$ of the time. That is, we
wish to have $(xc)(c^{-1} u c)$ very close to $K_c$ a significant proportion of
the time.

We do have $(xc)u$ very close to $K_c$ a huge proportion of the time.
Now $c$ acts on $U$ by conjugation, and the Jacobian of this
diffeomorphism is constant (hence bounded), as is the maximum eigenvalue of
the derivative. Thus, we obtain the desired conclusion by choosing $\delta$
sufficiently small (and $E$ to be a nice set) (see Exer. 4). □

(5.8.7) Completing the proof of Prop. 5.2.4'. Fix a set $\Omega_0$ as in
the Uniform Pointwise Ergodic Theorem (3.4.4). Suppose $x, y \in \Omega_0$
with $x \approx y$, and write $y = xg$. Given an (infinite) averaging set $E_n = a^{-n} E a^n$,
such that $g^{E_n}$ is bounded in $G/U$, and any $v \in E$, we wish
to show that $(a^{-n} v a^n) g U$ is infinitesimally close to $\operatorname{Stab}_G(\mu)/U$. The
proof of Prop. 5.2.4' will apply if we show that $yu'$ is close to $K$ a
significant proportion of the time.

To do this, we make the additional technical assumption that

$g^* = a^n g a^{-n}$ is finite (or infinitesimal). (5.8.8)

Let us assume that $E$ is a ball around $e$. Choose a small neighborhood $B$
of $v$ in $E$, and define

$\sigma \colon B \to U$ by $u g^* \in (G \ominus U) \cdot \sigma(u)$, for $u \in B$,

so

$u' = a^{-n} \sigma(u) a^n$.

The Jacobian of $\sigma$ is bounded (between $1/J$ and $J$, say), so we can
choose $\epsilon$ so small that

$(1 - J^2 \epsilon) \cdot \nu_U(B) > \epsilon \cdot \nu_U(E)$.

(The compact set $K$ should be chosen with $\mu(K) > 1 - \epsilon$.)

By applying Cor. 3.4.4 to the averaging sequence $\sigma(B)_n$ (and noting
that $n$ is infinitely large), and observing that

$y(a^{-n} u a^n) = x(a^{-n} g^* u a^n)$,

we see that

$\nu_U(\{\, u \in \sigma(B) \mid x(a^{-n} g^* u a^n) \not\approx K \,\}) < \epsilon\, \nu_U(B_n)$.

Therefore, the choice of $\epsilon$ implies

$\nu_U(\{\, u \in B \mid x(a^{-n} g^* \sigma(u) a^n) \approx K \,\}) > \epsilon\, \nu_U(E)$.

Because

$x(a^{-n} g^* \sigma(u) a^n) = x (a^{-n} g^* a^n)(a^{-n} \sigma(u) a^n) = xgu' = yu'$,

this completes the proof. □

(5.8.9) Technical assumption.

1) The technical assumption (5.8.8) in the proof of Prop. 5.2.4' can
be stated in the following explicit form if $g$ is infinitesimal: there
are

• an (infinite) integer $n$, and

• a finite element $u_0$ of $U$,
such that

(a) $a^{-n} u_0 a^n g \in G_- G_0 G_+$,

(b) $a^{-n} u_0 a^n g U$ is not infinitesimally close to $eU$ in $G/U$, and

(c) $a^n g a^{-n}$ is finite (or infinitesimal).

2) In non-infinitesimal terms, the assumption on $\{g_k\}$ is: there are

• a sequence $n_k \to \infty$, and

• a bounded sequence $\{u_k\}$ in $U$,
such that

(a) $a^{-n_k} u_k a^{n_k} g_k \in G_- G_0 G_+$,

(b) no subsequence of $a^{-n_k} u_k a^{n_k} g_k U$ converges to $eU$ in $G/U$,
and

(c) $a^{n_k} g_k a^{-n_k}$ is bounded.

Exercises for §5.8. 

#1. Show that if $g = a^{-t} v a^t$, for some standard $v \in G_-$, and $g$ is
infinitesimal, then either

(a) $g$ satisfies the technical assumption (5.8.9), or

#2. Show that if $g$ is as in Lem. 5.5.3, then $g$ satisfies the technical
assumption (5.8.9).

[Hint: Choose $n > 0$ so that $a^n g a^{-n}$ is finite, but not infinitesimal.
Then $a^n g a^{-n}$ is not infinitesimally close to $\tilde{S}$, so there is some (small)
$u \in U$, such that $u(a^n g a^{-n})$ is not infinitesimally close to $G_- G_0 U$.
Conjugate by $a^n$.]

#3. Provide a (non-infinitesimal) proof of Rem. 5.8.5.

#4. Complete the proof of Lem. 5.8.6, by showing that if $E$ is a convex
neighborhood of $e$ in $U$, and $\delta$ is sufficiently small, then, for all $n$
and every subset $X$ of $E_n$ with $\nu_U(X) > (1 - \delta)\nu_U(E_n)$, we have

$$ \nu_U\bigl(\{\, u \in E_n \mid c^{-1} u c \in X \,\}\bigr) > \delta \, \nu_U(E_n). $$

[Hint: There is some $k > 0$, such that

$$ c^{-1} E_{n-k}\, c \subset E_n, \quad \text{for all } n. $$

Choose $\delta$ small enough that

$$ \nu_U(E_{n-k}) > (J + 1)\,\delta\, \nu_U(E_n), $$

where $J$ is the Jacobian of the conjugation diffeomorphism.]

#5. Prove the non-infinitesimal version of Prop. 5.2.4': For any $\epsilon > 0$,
there is a compact subset $\Omega_\epsilon$ of $\Gamma\backslash G$, with $\mu(\Omega_\epsilon) > 1 - \epsilon$, and
such that if

• $\{x_k\}$ and $\{y_k\}$ are convergent sequences in $\Omega_\epsilon$,

• $\{g_k\}$ is a sequence in $G$ that satisfies 5.8.9(2),

• $x_k g_k = y_k$,

• $g_k \to e$,

• $\{E_n\}$ is an averaging sequence, and $\{n_k\}$ is a sequence of
natural numbers, such that $g_k^{E_{n_k}}$ is bounded in $G/U$
(independent of $k$),

• $g'_k \in g_k^{E_{n_k}}$, and

• $g'_k U/U$ converges,

then the limit of $\{g'_k U\}$ is an element of $S/U$.

#6. Prove Cor. 5.5.2 without using infinitesimals.

#7. Prove the non-infinitesimal version of Lem. 5.5.3: If

• $\{g_k\}$ is a sequence in $(G_- \ominus \widetilde{S}_-) G_0 G_+$,

• $\{E_n\}$ is an averaging sequence, and $\{n_k\}$ is a sequence of
natural numbers, such that $g_k^{E_{n_k}}$ is bounded in $G/U$
(independent of $k$),

• $g'_k \in g_k^{E_{n_k}}$, and

• $g'_k U/U$ converges,

then the limit of $\{g'_k U\}$ is an element of $G_+/U$.

#8. Prove Cor. 5.5.4'.

5.9. How to eliminate Assumption 5.3.1

Let $\hat{U}$ be a maximal connected, unipotent subgroup of $S$, and
assume $\{u^t\} \subset \hat{U}$.

From Rem. 5.8.3, we know, for $x, y \in \Gamma\backslash G$ with $x \approx y$, that the
transverse component of the relative position between $xu$ and $yu$ is
a polynomial function of $u$. (Actually, it is a rational function (cf.
Exer. 1d), but this technical issue does not cause any serious problems,
because the function is unbounded on $\hat{U}$, just like a polynomial would
be.) Furthermore, the transverse component belongs to $S$ (usually) and
normalizes $\hat{U}$ (see Props. 5.2.4' and 5.8.4). Let $\hat{S}$ be the closure of the
subgroup of $N_S(\hat{U})$ that is generated by the image of one of these
polynomial maps, together with $\hat{U}$. Then $\hat{S}$ is (almost) Zariski closed (see
Lem. 4.6.9), and the maximal unipotent subgroup $\hat{U}$ is normal, so the
structure theory of algebraic groups implies that there is a hyperbolic
torus $T$ of $\hat{S}$, and a compact subgroup $C$ of $\hat{S}$, such that $\hat{S} = TC\hat{U}$
(see Thm. 4.4.7 and Cor. 4.4.10(3)). Any nonconstant polynomial is
unbounded, so (by definition of $\hat{S}$), we see that $\hat{S}/\hat{U}$ is not compact;
thus, $T$ is not compact. Let

• $\{a^s\}$ be a noncompact one-dimensional subgroup of $T$, and

• $U = S_+$.

This does not establish (5.3.1), but it comes close:

• $\mu$ is invariant under $\{a^s\}$, and

• $\{a^s\}$ is hyperbolic, and normalizes $U$.

We have not constructed a subgroup $L$, isomorphic to $\mathrm{SL}(2,\mathbb{R})$, that
contains $\{a^s\}$, but the only real use of that assumption was to prove
that $J(a^{-s}, \widetilde{S}_-) \le J(a^s, S_+)$ (see 5.6.2). Instead of using the theory of
$\mathrm{SL}(2,\mathbb{R})$-modules, one shows, by using the theory of algebraic groups,
and choosing $\{a^s\}$ carefully, that $J(a^s, H) \ge 1$, for every Zariski closed
subgroup $H$ of $G$ that is normalized by $a^s \hat{U}$ (see Exer. 2).
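
To make the Jacobians $J(a^s, \cdot)$ concrete, here is a minimal numerical sketch for the simplest case $G = \mathrm{SL}(2,\mathbb{R})$, $a^s = \mathrm{diag}(e^s, e^{-s})$. It assumes that the Jacobian on a one-dimensional subalgebra is the absolute value of the scaling factor of conjugation, and it assumes the convention $x \mapsto a^{-s} x a^{s}$; with the opposite convention the upper and lower triangular lines trade places. The helper names (`conjugate`, `jacobian_on_line`) are hypothetical and chosen only for this illustration.

```python
import math

def a(s):
    """Diagonal matrix a^s = diag(e^s, e^-s) in SL(2, R)."""
    return ((math.exp(s), 0.0), (0.0, math.exp(-s)))

def matmul(p, q):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def conjugate(s, x):
    """Assumed convention (not taken from the text): x -> a^{-s} x a^{s}."""
    return matmul(matmul(a(-s), x), a(s))

def jacobian_on_line(s, x):
    """Absolute scaling factor of conjugation by a^s on the line spanned by x.

    Only meaningful when x is an eigenvector of the conjugation map,
    which is the case for the three basis matrices below.
    """
    y = conjugate(s, x)
    i, j = next((i, j) for i in range(2) for j in range(2) if x[i][j] != 0)
    return abs(y[i][j] / x[i][j])

# Standard basis of the Lie algebra sl(2, R).
LOWER = ((0.0, 0.0), (1.0, 0.0))   # lower-triangular nilpotent direction
DIAG = ((1.0, 0.0), (0.0, -1.0))   # diagonal direction
UPPER = ((0.0, 1.0), (0.0, 0.0))   # upper-triangular nilpotent direction

if __name__ == "__main__":
    for name, x in (("lower", LOWER), ("diag", DIAG), ("upper", UPPER)):
        print(name, jacobian_on_line(1.0, x))
```

Under these assumptions the three factors are $e^{2s}$, $1$, and $e^{-2s}$: one line is expanded, one is contracted, and their product is 1, which reflects the unimodularity of $\mathrm{SL}(2,\mathbb{R})$.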

An additional complication comes from the fact that $a^s$ may not
act ergodically (w.r.t. $\mu$): although $u^t$ is ergodic, we cannot apply the
Mautner Phenomenon, because $U = S_+$ may not contain $\{u^t\}$ (since
$(u^t)_-$ or $(u^t)_0$ may be nontrivial). Thus, one works with ergodic
components of $\mu$. The key point is that the arguments establishing Prop. 5.6.1
actually show that each ergodic component of $\mu$ is $\widetilde{S}_-$-invariant. But
then it immediately follows that $\mu$ itself is $\widetilde{S}_-$-invariant, as desired, so
nothing was lost.

Exercises for §5.9. 

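#1. Let

$$ B = \left\{ \begin{bmatrix} * & * \\ 0 & * \end{bmatrix} \right\} \subset \mathrm{SL}(2,\mathbb{R}), $$

and define $\psi \colon \mathbb{U}_2 \times B \to \mathrm{SL}(2,\mathbb{R})$ by $\psi(u, b) = ub$.

(a) Show $\psi$ is a polynomial.

(b) Show $\psi$ is injective.

(c) Show the image of $\psi$ is a dense, open subset $S$ of $\mathrm{SL}(2,\mathbb{R})$.

(d) Show $\psi^{-1}$ is a rational function on $S$.

[Hint: Solve

$$ \begin{bmatrix} x & y \\ z & w \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ u & 1 \end{bmatrix} \begin{bmatrix} a & b \\ 0 & 1/a \end{bmatrix} $$

for $u$, $a$, and $b$.]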
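
The equation in the hint can be checked mechanically. The sketch below is only an illustration, not part of the text: it multiplies out the two factors, giving $x = a$, $y = b$, $z = ua$, $w = ub + 1/a$, and inverts these relations as $a = x$, $b = y$, $u = z/x$, which is a rational expression defined exactly where $x \ne 0$. The function names `psi` and `psi_inverse` are hypothetical, and exact rational arithmetic is used so that the round trip can be tested without rounding error.

```python
from fractions import Fraction

def psi(u, a, b):
    """Return [[1, 0], [u, 1]] * [[a, b], [0, 1/a]] as a nested tuple (a != 0)."""
    u, a, b = Fraction(u), Fraction(a), Fraction(b)
    return ((a, b), (u * a, u * b + 1 / a))

def psi_inverse(m):
    """Recover (u, a, b) from a determinant-one matrix [[x, y], [z, w]] with x != 0."""
    (x, y), (z, w) = ((Fraction(p), Fraction(q)) for p, q in m)
    if x * w - y * z != 1:
        raise ValueError("matrix does not have determinant one")
    if x == 0:
        raise ValueError("matrix is outside the image of psi (x = 0)")
    # a = x and b = y are read off the first row; u = z / x is rational in the entries.
    return (z / x, x, y)

if __name__ == "__main__":
    m = ((2, 3), (5, 8))          # determinant 2*8 - 3*5 = 1
    u, a, b = psi_inverse(m)
    print(u, a, b)
    print(psi(u, a, b) == m)
```

The only matrices of determinant one that are excluded are those with upper-left entry zero, which is consistent with the image being a dense, open subset as claimed in part (c).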



#2. Show there is a (nontrivial) hyperbolic one-parameter subgroup
$\{a^s\}$ of $\hat{S}$, such that $J(a^s, H) \ge 1$, for every almost-Zariski closed
subgroup $H$ of $G$ that is normalized by $\{a^s\}\hat{U}$, and every $s > 0$.
[Hint: Let $\phi \colon \hat{U} \to \hat{S}$ be a polynomial, such that $\langle \phi(\hat{U}), \hat{U} \rangle = \hat{S}$.
For each $H$, we have $J(u, H) = 1$ for all $u \in \hat{U}$, and the function
$J(\phi(u), H)$ is a polynomial on $\hat{U}$. Although there may be infinitely
many different possibilities for $H$, they give rise to only finitely many
different polynomials, up to a bounded error. Choose $u \in \hat{U}$, such that
$|J(\phi(u), H)|$ is large for all $H$, and let $a^1 = \phi(u)_h$ be the hyperbolic
part in the Jordan decomposition of $\phi(u)$.]



Notes 

Our presentation in this chapter borrows heavily from the original
proof of M. Ratner [2, 3, 4], but its structure is based on the approach
of G. A. Margulis and G. M. Tomanov [1]. The two approaches are
similar at the start, but, instead of employing the entropy calculations
of §5.6 to finish the proof, Ratner [3, Lem. 4.1, Lem. 5.2, and proof of
Lem. 6.2] bounded the number of small rectangular boxes needed to
cover certain subsets of $\Gamma\backslash G$. This allowed her to show [3, Thm. 6.1]
that the measure $\mu$ is supported on an orbit of a subgroup $L$, such that

• $L$ contains both $u^t$ and $a^s$,

• the Jacobian $J(a^s, \mathfrak{l})$ of $a^s$ on the Lie algebra of $L$ is 1, and

• $L_0 L_+ \subset \operatorname{Stab}_G(\mu)$.

Then an elementary argument [3, §7] shows that $L \subset \operatorname{Stab}_G(\mu)$.

The proof of Margulis and Tomanov is shorter, but less elementary, 
because it uses more of the theory of algebraic groups. 

§5.2. Lemma 5.2.1 is a version of [2, Thm. 3.1 ("R-property")] and 
(the first part of) [1, Prop. 6.1]. 

Lemma 5.2.2 is implicit in [2, Thm. 3.1] and is the topic of [1, §5.4]. 

Proposition 5.2.4' is [1, Lem. 7.5]. It is also implicit in the work of 
M. Ratner (see, for example, [3, Lem. 3.3]). 

§5.4. The definition (5.4.1) of $\widetilde{S}$ is based on [1, §8.1] (where $\widetilde{S}$ is
denoted $F(s)$ and $\widetilde{S}_-$ is denoted $U^-(s)$).
§5.5. Corollary 5.5.2 is [1, Cor. 8.4].

Lemma 5.5.3 is a special case of the last sentence of [1, Prop. 6.7]. 
§5.6. Proposition 5.6.1 is [1, Step 1 of 10.5]. 

§5.7. The proof of Prop. 5.7.1 is based on [1, Lem. 3.3]. 

The Claim in the proof of Thm. 5.7.2 is [1, Step 2 of 10.5]. 

The use of entropy to prove that if $\mu$ is $G_-$-invariant, then it is
$G_+$-invariant (alternative (b) on p. 176) is due to Margulis and Tomanov
[1, Step 3 of 10.5].

References to results on invariant measures for horospherical sub- 
groups (alternative (c) on p. 176) can be found in the historical notes 
at the end of Chap. 1. 

Exercise 4.7#8 is [1, Prop. 3.2]. 

§5.8. The technical assumption (5.8.9) needed for the proof of
(5.2.4') is based on the condition (*) of [1, Defn. 6.6]. (In Ratner's
approach, this role is played by [3, Lem. 3.1] and related results.)

Exers. 5.8#1 and 5.8#2 are special cases of [1, Prop. 6.7].

§5.9. That S/U is not compact is part of [1, Prop. 6.1]. 
That $a^s$ may be chosen to satisfy the condition $J(a^s, H) \ge 1$ is [1,
Prop. 6.3b].

The (possible) nonergodicity of $\mu$ is addressed in [1, Step 1 of 10.5].
Chapter 1. Introduction to Ratner's Theorems

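$\mathbb{T}^n = \mathbb{R}^n/\mathbb{Z}^n$ = $n$-torus, 1
$G$ = Lie group, 2
$\Gamma$ = lattice in $G$, 2
$u^t$ = unipotent one-parameter subgroup, 2
$\varphi_t$ = $u^t$-flow on $\Gamma\backslash G$, 2
$[x]$ = image of $x$ in $\Gamma\backslash G$, 2
$\Gamma\backslash G = \{\Gamma x \mid x \in G\}$, 2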

SL(2,R) = group of 2 x 2 real matrices of determinant one, 



u* = 
a* = 


V 
$\eta_t$ = horocycle flow = $u^t$-flow on $\Gamma\backslash\mathrm{SL}(2,\mathbb{R})$, 3
$\gamma_t$ = geodesic flow = $a^t$-flow on $\Gamma\backslash\mathrm{SL}(2,\mathbb{R})$, 3
$\mu$ = measure on $G$ or on $\Gamma\backslash G$, 4
$\mathcal{F}$ = fundamental domain for $\Gamma$ in $G$, 4
$\mu_G$ = $G$-invariant ("Haar") probability measure on $\Gamma\backslash G$, 4
$\mathfrak{H}$ = hyperbolic plane, 8
$\operatorname{Stab}_H(\Gamma x)$ = stabilizer of $\Gamma x$ in $H$, 9
$\mathrm{SO}(Q)$ = orthogonal group of quadratic form $Q$, 14
$\mathrm{SO}(m,n)$ = orthogonal group of quadratic form $Q_{m,n}$, 14
$v^\perp$ = orthogonal complement of vector $v$, 16
$\mu_S$ = $S$-invariant probability measure on an $S$-orbit, 20
$I$ = identity matrix, 33
$\|\cdot\|$ = matrix norm, 33
$\mathrm{Mat}_{k \times k}(\mathbb{R})$ = { $k \times k$ real matrices }, 33
$x_t = x u^t$ (image of $x$ under unipotent flow), 34
$x \approx y$ = $x$ is infinitesimally close to $y$, 36
$\psi_*\mu$ = push-forward of $\mu$ by the map $\psi$, 39
$\xi_t$ = vertical flow on $X \times X$, 40
$\mu_1 \perp \mu_2$ = measures $\mu_1$ and $\mu_2$ are singular to each other, 45
$\underline{g}$ = element of $\mathfrak{g}$ with $\exp \underline{g} = g$, 46
$\operatorname{Stab}_G(\mu)$ = stabilizer = {elements of $G$ that preserve $\mu$}, 48
$h_\mu(g)$ = entropy of the translation by $g$, 52
Prob(X) = { probability measures on X }, 58 

Chapter 2. Introduction to Entropy 

$T_\beta$ = irrational rotation, 69
$T_{\mathrm{Bern}}$ = Bernoulli shift, 71
$T_{\mathrm{Bake}}$ = Baker's Transformation, 71

S, S = partition, 76 

H(S) = entropy of the partition S, 76 

S V S = join of the partitions, 77 

H (S | S) = conditional entropy, 78 

E k (T,S) — information expected from k experiments, 78 

T e (S) = transform of S by T e , 78 

h(T,S) = entropy of T w.r.t. S, 78 

$h(T) = h_\mu(T)$ = entropy of $T$, 78
$h_{\mathrm{top}}(T)$ = topological entropy of $T$, 79

S+= \I™ =1 T*S, 83 

H(S | 5 + ) = conditional entropy, 83 

$G_+$ = horospherical subgroup, 86
$J(g, W)$ = Jacobian of $g$ on $W$, 87
$m_x(\lambda)$ = multiplicity of Lyapunov exponent, 88

S(x) or S + (x) = atom of a partition, 89, 91 

Chapter 3. Facts from Ergodic Theory 

$L^1(X, \mu)$ = Banach space of real-valued $L^1$ functions on $X$, 100
$E_n$ = averaging set $a^{-n} E a^{n}$, 110

Chapter 4. Facts about Algebraic Groups

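$\mathbb{R}[x_{1,1}, \ldots, x_{\ell,\ell}]$ = real polynomials in $\{x_{i,j}\}$, 117
$g_{i,j}$ = entries of matrix $g$, 117
$\mathcal{Q}$ = some subset of $\mathbb{R}[x_{1,1}, \ldots, x_{\ell,\ell}]$, 117
$\operatorname{Var}(\mathcal{Q})$ = variety associated to set $\mathcal{Q}$ of polynomials, 117
$\mathbb{D}_\ell$ = {diagonal matrices in $\mathrm{SL}(\ell, \mathbb{R})$}, 118
$\mathbb{U}_\ell$ = {unipotent lower-triangular matrices in $\mathrm{SL}(\ell, \mathbb{R})$}, 118
$\operatorname{Stab}_{\mathrm{SL}(\ell,\mathbb{R})}(v)$ = stabilizer of vector $v$, 118
$\operatorname{Stab}_{\mathrm{SL}(\ell,\mathbb{R})}(V)$ = stabilizer of subspace $V$, 118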

S(Z) = ideal of polynomials vanishing on Z, 120 

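$\overline{H}$ = Zariski closure of $H$, 121
$g_u, g_h, g_e$ = Jordan components of matrix $g$, 125
$\operatorname{trace} g$ = trace of the matrix $g$, 127
$A \ltimes B$ = semidirect product, 130
$g^\rho = \rho(g)$, 133
$\mathbb{R}P^{m-1}$ = real projective space, 140
$[v]$ = image of vector $v$ in $\mathbb{R}P^{m-1}$, 140
$\mathrm{Mat}_{\ell \times \ell}(\mathbb{R})$ = all real $\ell \times \ell$ matrices, 147
$\mathfrak{g}, \mathfrak{h}, \mathfrak{u}, \mathfrak{s}, \mathfrak{l}$ = Lie algebra of $G$, $H$, $U$, $S$, $L$, 147
$[\underline{x}, \underline{y}] = \underline{x}\,\underline{y} - \underline{y}\,\underline{x}$ = Lie bracket of matrices, 148
$\operatorname{Rad} G$ = radical of the Lie group $G$, 149
$\operatorname{Ad}_G$ = adjoint representation of $G$ on its Lie algebra, 150
$\operatorname{ad}_{\mathfrak{g}}$ = adjoint representation of the Lie algebra $\mathfrak{g}$, 150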

Chapter 5. Proof of the Measure- Classification Theorem 

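$\mu$ = ergodic $u^t$-invariant probability measure on $\Gamma\backslash G$, 159
$S = \operatorname{Stab}_G(\mu)^\circ$, 162
$L$ = subgroup locally isomorphic to $\mathrm{SL}(2,\mathbb{R})$ containing $u^t$, 165
$a^s$ = hyperbolic one-parameter subgroup in $\operatorname{Stab}_{N_L(\{u^t\})}(\mu)$, 165
$\mathfrak{g}, \mathfrak{s}, \mathfrak{l}$ = Lie algebra of $G$, $S$, $L$, 166
$\underline{g} = \log g$, 166
$v^r$ = unipotent subgroup of $L$ opposite to $\{u^t\}$, 166
$\mathfrak{g}_-, \mathfrak{g}_0, \mathfrak{g}_+$ = subspaces of $\mathfrak{g}$ (determined by $a^s$), 166
$\mathfrak{s}_-, \mathfrak{s}_0, \mathfrak{s}_+$ = subspaces of $\mathfrak{s}$ (determined by $a^s$), 166
$G_-, G_0, G_+, S_-, S_0, S_+$ = corresponding subgroups, 166
$U = S_+$, 166
$\mathfrak{u}$ = Lie algebra of $U$, 166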






List of Notation 



G G 



V e G_G ?7, for all u e [/ j, 167 



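$\widetilde{S}_- = \widetilde{S} \cap G_-$, 167
$\mathfrak{g}_+ \ominus \mathfrak{u}$ = complement to $\mathfrak{u}$ in $\mathfrak{g}_+$, 171
$\mathfrak{g}_- \ominus \widetilde{\mathfrak{s}}_-$ = complement to $\widetilde{\mathfrak{s}}_-$ in $\mathfrak{g}_-$, 171
$G_+ \ominus U = \exp(\mathfrak{g}_+ \ominus \mathfrak{u})$, 171
$G_- \ominus \widetilde{S}_- = \exp(\mathfrak{g}_- \ominus \widetilde{\mathfrak{s}}_-)$, 171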
of an operator, 104

representation, see 

representation, adjoint 
algebraic closure, 145 
algebraic group 

compact, 131, 136 

defined over Q, 15, 144-146 

over C, 152 

over R, 118 

real, 14, 15, 118-146, 152, 153 

simply connected, 154 

theory of, 15, 117, 150, 152-154, 
160, 176, 183, 184 

unipotent, see unipotent subgroup 
atom (of a partition), 76, 79, 80, 83, 

84, 88, 89, 91, 92, 94 
automorphism, 43 

expanding, 110, 111 

inner, 43 
averaging 

sequence, 110, 111, 113, 177-182 

set, 110, 179, 180 

babysitting, 35 

Baker's Transformation, 71-73, 75, 

81, 82, 84, 85, 95 
Banach space, 107 
basis (of a vector space), 27 
Bernoulli shift, 30, 31, 62, 70-73, 75, 

81, 84, 95 
bilinear form 

nondegenerate, 16

symmetric, 16 
bit (of information), 76 
butterfly flapping its wings, 73 

Cantor set, 3 



Cartan Decomposition, 105 
centralizer, 46, 47, 124, 125, 130 
chain condition 

ascending, 120, 121 

descending, 120, 121 
Claim, 41, 185 
classical limit, 26 
coin tossing, 70, 74, 78 

history, 70 
commutation relations, 3, 8 
commutative algebra, 145 
complement (of a subspace), 151, 

171, 173 
component 

connected, 7, 14, 18, 57, 83, 119, 
121, 122, 138, 142, 160 

ergodic, see ergodic component

G+-, 161, 167 

identity, 14, 49, 122, 129, 167, 168 

irreducible, 120, 121 

Jordan, 125 

of an ordered pair, 54 

transverse, 47, 183 
continued fraction, 61 
converge uniformly, 44 
convex 

combination, 107, 109 

set, 107 
right, 2
30, 170
determinant, 3, 10, 14, 17, 18, 117,
diagonal embedding, 39, 54
over C, 89, 124, 127, 129
dichotomy, 6

diffeomorphism, 85, 87, 88, 130, 149, 
171, 173, 180, 182 
local, 173 
of a Lie group, 7, 62, 91, 109, 122,
of a manifold, 30, 51, 59, 119, 140
151, 175, 178
discrete time, 69
dough, 71, 73
dynamical system, 69
127, 132, 151, 175, 180
entropy, 69-98, 173
expected, 77

of a dynamical system, 50, 52, 55, 

62, 72-98, 173-176, 184, 185 
of a flow, 27, 79 

of a partition, 75-80, 82, 83, 89, 
95 

topological, 79, 80, 95 
equivalence relation, 32, 74, 140
theory, 69, 112, 154
exponential
function, 34, 127
map, 136, 148 
exterior power, 134, 135 
fiber, 52
invariant, 20, 108, 145
polynomial, 34, 43, 57, 117-155,
169, 177, 178, 183, 184 
rational, 117, 126, 138, 169, 170,
183, 184 

regular, 126 

step, 44, 106 
Functional Analysis, 20, 107 
fundamental domain, 4, 5, 10, 11 

Galois 

automorphism, 145, 146 

cohomology, 154 

extension, 145 

group, 145 
14, 27, 61, 146
point, 100, 163, 177, 180

set, 38, 41 
geodesic

child, 35 

flow, 3, 4, 8, 9, 29, 30, 34-36, 47, 
51, 52, 61, 86, 99, 102 
132, 137, 138, 149, 154
arithmetic, 144, 154, 155
discrete, see subgroup, discrete
isometry, 14
nilpotent, 60

orthogonal (special), see SO(Q)
semisimple, 60

simple (or almost), 28, 103, 129, 

130, 142, 146 
solvable, 60, 130, 149, 150 
flow, 3, 4, 8, 9, 31, 32, 40, 50, 59,

60, 62, 99, 102, 103 
in the hyperbolic plane, 8 
element, 110, 124-155
geometry, 12 
measure, 5 
metric, 26 
n-manifold, 12 
n-space, 12 
plane, 8, 11 

torus, see torus, hyperbolic 

impurity, 73 

induction on dimG, 176 
close, see infinitesimally close

far, 173 

large, 172, 180, 181 
long, 34, 179 
small, see infinitesimal 
thin, 92 

infinitesimal, 35, 36, 62, 163, 172, 
173, 179-181 
infinitesimally close, 172, 179, 181
information, 74, 76-80, 82, 83 
subbundle, 85
under isomorphism, 80 
irrational 

number, 7, 15, 17, 19, 22, 70, 72, 

75, 81, 83, 123 
rotation, 69, 72, 74, 75, 81, 83, 84, 
95 

irreducible (Zariski closed set), 120, 
121, 124 

isometry, 8, 51, 72, 80, 84, 85, 88 
isomorphic, measurably, 30, 72, 82 
Iwasawa decomposition, 153 

Jacobian, 86, 87, 161, 164, 173-175, 

177, 180, 182-185 
join (of partitions), 77 
joining, see self-joining 
component, see

component, Jordan 
decomposition, 125, 127, 131, 153, 

184 
Lagrange Multipliers, 80
entry, 47
term, 36, 47, 54
compact, 24, 111, 136, 140, 153

linear, 147, 152 

semisimple, 129-131, 133, 137, 
142, 149, 153, 154, 160, 177 
solvable, see group, solvable
unimodular, 10, 150, 164, 176 

Lie subalgebra, 148 

Lie subgroup, 87, 147 

linear 
form, 15 
functional, 134 

linearization method, 63 

locally compact, 44, 99 

locally isomorphic, 147, 148, 160 

location, 72, 73 

logarithm, 136 

lower-triangular matrices, 6, 10, 118, 
130, 165 

strictly, 136, 147 

unipotent, 118 
Lyapunov exponent, 88 
manifold, 1, 4, 12, 22, 26, 34, 85,
87-89, 119, 140, 142, 147 

map 

affine, 31, 42, 43

covering, 9, 12, 39, 142 

equivariant, see equivariant

measure-preserving, 109 

proper, 12 
matrix 

entry, 126, 135 

nilpotent, 18, 125, 136, 141, 151 
norm, 33 

Mautner Phenomenon, 112, 165, 183

maximal subalgebra, 18 

measure, 4 

absolutely continuous, 62 
algebraic, 22 
direct integral, 20, 23, 52
Haar, 4, 10, 19, 51, 52, 55, 92, 99,
limit, 57, 58

probability, 19, 58 

product, 39, 70 

regular, 4, 43 
singular, 40, 45, 62
measure-preserving, 42, 43, 52 
metric on T\G, 33 
mischief, 35 

mixing, 103, 106, 112, 177 
module, 162, 174 
monomial, 135 
multilinear algebra, 134 
multiplies areas, 52 

Noetherian ring, 120
non-infinitesimal version, 36, 163, 
172, 173, 175, 177-179, 181, 182 
Nonstandard Analysis, 36, 62
operator, 104

subgroup, see subgroup, normal 
vector, 8 

normalizer, 8, 12, 28, 47-51, 55, 58, 
59, 87, 110, 128, 138, 142, 151, 
160, 162-173, 177-184 

nowhere dense, 170 

Nullstellensatz, 145 

one-point compactification, 58 
Oppenheim Conjecture, 13, 60, 61
orbit, 23 

p-adic, 27, 61 
partition, 76 

countable, 82, 88, 89, 92 

generating, 81-84 

subordinate, 91 
past determines the future, 74, 82, 
95 

Pesin's Entropy Formula, 88 

plaque, 90-92, 95 

point 

accumulation, 1, 37-39

extreme, 23, 107, 109 

mass, 52, 57 

nearby, 33-36, 46 

of density, 88 
polar form, 125 
polynomial, see 

function, polynomial 

divergence, 29, 34, 39, 57, 58, 62, 
162, 183 

speed, see speed, polynomial 
predictable, 74 
projective space, 140-142 
property 

H-, 62 

R-, 62, 184 

Ratner, see Ratner property
shearing, see shearing property 
push-forward (of a measure), 39,
112, 141, 143 

quadratic form, 13-18, 24, 60, 146 

degenerate, 13 

indefinite, 13, 14, 16 

nondegenerate, 13, 16, 24, 130 

positive definite, 17 
Quantum 

Mechanics, 26 



Unique Ergodicity, 26, 27, 61 
quaternion, 27 
quotient 

map, 37 

of a flow, 31, 32, 43 

R-property, see property, R- 
R-split 

element, 124 

torus, 131 
radical of a Lie group, 130, 149 
Radon-Nikodym derivative, 23 
Ratner 

Joinings Theorem, 40, 45, 48, 53 
method, 33 
property, 62 

Quotients Theorem, 32, 43, 55, 62 
Rigidity Theorem, 30-32, 62 
Theorem, see Ratner's Theorems 
Ratner's Theorems, 4, 13, 14, 22, 

24, 27, 28, 31, 32, 47, 50, 59-61, 

99, 109, 144 
Equidistribution, 20, 21, 26, 56, 

59-61 

Measure Classification, 21-23,
29-32, 39, 40, 43, 45, 48-50, 54, 
56-62, 159 
Orbit Closure, 1-7, 12, 15, 18, 21, 
22, 56, 59-61 
relative motion, 35, 36, 46 

fastest, 47 
representation, 132, 169 

adjoint, 18, 150 
roots and weights, 130, 153 

scale like Lebesgue measure, 52
second order effect, 47 
self-joining, 39, 40, 53 
diagonal, 39 

finite cover, 39, 40, 45, 53, 55 
finite fibers, 40, 46, 48, 49 
product, 40, 45 

semidirect product, 130, 139

semisimple element, 153

separable, 44, 45, 99, 108, 109 

set 

discrete, 17, 118 

invariant, 21, 23, 99, 101, 106, 

107, 110 
minimal, 42 
shearing property, 27, 29, 35-37, 39, 

40, 46, 48, 50, 62, 161, 162, 171 
in SL(2,R), 47
signature (of a quadratic form), 14, 
16, 24 

simply connected, 139, 154 
singular set, 59, 119, 153 
singularity, 119 

SL(2,R), 2-63, 86, 88, 90, 99, 102, 

103, 106, 122, 124, 127, 151, 

160, 165, 169, 183, 184 
SL(2,Z), 4, 6, 10, 12, 63, 99, 144 
SL(3,R), 6, 14, 18, 46, 60, 123, 168 
SL(3,Z), 6, 12, 14, 15 
SL(4,R), 42, 123, 139 
SL(ℓ,C), 127, 129, 130, 150
SL(ℓ,Q), 142, 145, 146
SL(ℓ,R), 14, 16, 25, 27-28, 30, 99,

117-154, 159, 160, 170, 177, 178
SL(ℓ,Z), 25, 139, 142, 144-146, 154
SL(V), 128, 134, 135 
SO(Q), 8, 14-18, 24, 25, 118, 130, 

133, 144, 146 
speed 

exponential, 34 
polynomial, 34, 59, 60, 162 
of a measure, 48-51, 55, 56, 160,

162-166, 179, 180 
of a point, 8, 9 

of a subspace, 118, 133-135, 144

of a tangent vector, 8 

of a vector, 17, 118, 144, 145, 169 
stretching, 85, 86 
subadditive sequence, 80 
subgroup 

commutator, 137 

conjugate, 30, 31, 42, 43, 55, 56, 
58 

connected, 2, 5, 6, 10, 12, 15, 
17-20, 53-61, 86-89, 103, 110, 
111, 122, 124, 128-139, 142, 
143, 146-152, 154, 159, 160, 
162-173, 176, 177, 179, 183 

dense, 142 

discrete, 1-5, 9-12, 28, 61, 105, 
106, 119, 130, 143, 152, 159 

horospherical, 60, 86-89, 176, 177,
185 

opposite, 27 
Levi, 130, 152 

normal, 42, 50, 111, 128-131, 138, 
142, 149, 152, 164, 176, 177, 183 
of G x H, 42 



one-parameter, 2, 3, 7, 99, 103, 
104, 149, 160, 165 
diagonal, 4, 29, 51 
hyperbolic, 102, 151, 160, 177, 

183, 184 
unipotent, see 

unipotent subgroup,
one-parameter 
unipotent, 62 
submanifold, 3, 6, 7, 59, 119, 147 
subspace, invariant, 133, 143, 151, 

152 
support 

of a function, 20, 26, 106, 110, 

111, 164 
of a measure, 21, 23, 24, 31, 40, 
49-51, 55, 56, 87, 88, 90, 108, 
141-143, 159, 160, 162-164, 176 
curvature, 3, 9
symmetric function, 145 

tangent space, 147 

technical assumption, 163, 164, 166, 

172, 173, 180, 181, 185 
Theorem 

Borel Density, 15, 58, 139, 142, 

143, 154 
Burnside, 133, 153 
Chevalley, 133-135, 141, 143, 178 
Choquet, 20, 23, 61, 107, 112 
Engel, 133, 153 
Fubini, 107, 108
Inverse Function, 166, 173 
Jacobson Density, 153 
Lie-Kolchin, 130, 150 
Lusin, 37, 38, 44 
Margulis (on quadratic forms), 

13-15, 24, 61, 144 
Maximal Ergodic, 100 
Moore Ergodicity, 99, 103, 112, 
Poincaré Recurrence, 140, 141,
154 

Pointwise Ergodic, 20, 38, 99-101, 
106, 109-113, 164, 172, 180 
Uniform, 111, 113, 164 

Ratner, see Ratner's Theorems

Stone-Weierstrass, 136

Von Neumann Selection, 108 

Whitney, 14 

Witt, 17 
torus

algebraic, 129-132, 138 

compact, 131-133, 137, 138 

hyperbolic, 104, 131-134, 
137-139, 153, 183 

T^n, 1, 19, 31
totally isotropic subspace, 16 
trace (of a matrix), 117, 127 
transcendental, 123 
transitive action, 8, 25 
transpose, 27 

transverse divergence, 47-50, 54, 55, 
62, 161-163, 166, 167, 172, 173, 
178, 183 

trivial quotient, 32 

uniformly distributed, 18-20, 24, 26, 

56, 71, 72, 99, 100 
unipotent
child, 35 

element, 22, 86, 124-146, 153 
flow, 2, 4-7, 12, 19-21, 27, 29-32, 
34, 39, 46, 50, 59-62, 86, 89, 

117, 144 
group, 129 

matrix, 3, 10, 42, 124, 129 
not, 3 
radical, 131 

subgroup, 48-51, 60, 109-111, 

118, 130, 131, 133, 136, 139, 
143, 145, 159, 162, 163, 177, 179 
maximal, 183 

one-parameter, 2, 3, 5, 6, 25, 
29, 46, 48-50, 58-60, 90, 102, 
143, 151, 166 

opposite, 51 
uniqueness, 125 
unit tangent bundle, 3, 8, 9 
unitary operator, 103, 104 
unpredictable, 72-74, 78 
upper half plane, 8 
u^t-flow, see unipotent flow

vanish at oo, 107 

variety, 117, 120, 133, 134, 144, 145, 
153, 176 
abelian, 152
volume preserving, 51 

wandering around the manifold, 34 
warning signs, 35 
weak convergence, 104 
weak*-compact, 107
weight space, 151, 166, 174, 175

Weyl chamber, 105 

Zariski 

closed, 118-155, 170, 178, 183 
almost, 122-155, 160, 169, 176, 
183, 184 

closure, 15, 121-155, 167, 169, 
170, 176, 190 

dense, 139 



